Welcome Note


E. Annamalai 
Director (retired), Central Institute of Indian Languages, Mysore 
Professor of Tamil (retired), The University of Chicago, Chicago 


The languages of South Asia are studied by scholars from various 
disciplinary points of view and with various disciplinary tools. These 
languages have a long history and have troves of written texts of 
many centuries, whose study contributes to historical linguistics and 
philological studies. The major languages of South Asia have a long
history of grammatical description and philosophical investigation
dating back to ancient times. The richness of the South Asian languages
in all the above aspects is unparalleled in any region of the world.

The current journals devoted to promoting scholarship on South
Asian languages, however, are fragmented, being focused on particular
specialisations or periods. The new journal Bhasa. Journal of
South Asian Linguistics, Philology and Grammatical Traditions is a
much-appreciated alternative in this scenario. It is particularly
welcome for the scholars of languages working in South Asia, who
generally combine the knowledge of modern grammatical theories and
traditional grammars, and of language descriptions and language
histories. This new journal provides them with an avenue to break down
the existing boundaries between linguistics, philology and grammatical
traditions.

I welcome such scholars to contribute their research work to the 
Journal and to build bridges between various aspects of their lan- 
guage of study through their publication. 
The Range of Bhasa


Hans Henrich Hock 
Professor Emeritus of Linguistics and Sanskrit at the University of Illinois, 
Urbana-Champaign, USA 


It is a great pleasure to welcome the first issue of a new journal ded- 
icated to the study of South Asian languages in the broadest sense 
possible. 

During the 20th century the earlier combination of textual studies 
(philology), investigation of indigenous grammatical traditions, and 
what may be called 'pure linguistics' has increasingly become sev- 
ered. True, practicing historical-comparative linguists continue to do 
a fair amount of philological work on the texts from which they draw 
their data, and many of them go beyond, in terms of detailed textu- 
al studies. Some also still study the ancient grammarians, either in 
their own right or in terms of what insights they may offer for gener- 
al linguistic studies. But academic institutions and journals tend to 
separate linguistics from philology, and the study of ancient gram- 
matical traditions tends to fall between the chairs. 

It is gratifying, therefore, that Bhasa aims to bring philology,
linguistics, and the study of Indian grammatical traditions together
under one roof. The fact that it does so in an open-access format
adds further reasons for appreciation and gratitude.

In fact, the contributions to this first issue already manifest much
of the intended breadth of the Journal, ranging from 'purely'
linguistic investigations on South Asian negation and predicative
possessives in Hindi, to a study of sarcasm in Kashmiri, to the ideology
underlying the Sanskrit revival movement and speakers' declarations
in the Indian Census, to - last, but by no means least - a 'purely'
philological investigation on the Pali Milindapañha and its Chinese
counterparts.
It is to be hoped that future issues continue in this broad direction
and further increase the breadth by addressing issues in all of
the languages and language families of South Asia, whether linguistic,
philological, or focused on the different grammatical traditions.

The Editor-in-Chief, Professor Andrea Drocco, and his home institution,
the Department of Asian and North African Studies of Ca' Foscari
University of Venice, deserve our deep gratitude for having embarked
on this ambitious voyage.


Welcome Bhasa


Silvia Luraghi 
Università degli Studi di Pavia, Italia 


It is always exciting to see a new outlet for linguistic publications 
opening up opportunities for scholarly exchange, especially one 
which is explicitly engaged in the mission of bringing together lin- 
guistic, philological and grammatical studies. Cross-disciplinary re- 
search is of paramount importance in a field such as the one that 
Bhasa aims to cover, the languages of South Asia, in which linguis- 
tic, philological and grammatical studies can all count on a long and 
rich tradition, with a visible, and regrettable, lack of communication 
that has remained quite steady over time. The result of mutual igno- 
rance is often a duplication of efforts by scholars that work in neigh- 
bouring fields, and could profit from the advancements of colleagues 
working on the same topic but within different traditions and with 
different means of investigation. Fostering communication, on the 
other hand, widens the opportunity to build on results that have al- 
ready been achieved, resulting in a growth that ultimately promotes 
research irrespective of the separate tradition to which individual 
scholars belong. Building bridges is not an easy task when it is
confronted with well-established habits, which sometimes lead
philologists to be suspicious of linguists or the other way around. In
this respect, the first issue of Bhasa lives up to its commitment, as it
comprises a wide range of studies covering both philological and
linguistic research, as witnessed by the papers by Bryan De Notariis on
the reconstruction of the archetype of a Buddhist manuscript on the one
hand, and by Lucrezia Carnesale concerning Hindi possessive
constructions on the other hand. In addition, Patrick McCartney
addresses a sociolinguistic issue connected with speakers' attitudes as
witnessed by the 2001 and the 2011 censuses. Finally, two other
essentially linguistic contributions, one by John Peterson and Lennart
Chevallier and the other by Peter Hook and Omkar Koul, tackle their
topics from the point of view of typological comparison (negation in
South Asian languages) and of areal linguistics (sarcasm in Kashmiri
as an areal feature), covering a large spectrum of languages from
different language families that reflects the linguistic wealth of the
area. For these reasons, Professor Andrea Drocco, the founder and
editor of Bhasa, deserves the gratitude of all scholars interested in
various ways in the languages of South Asia.
Editorial


Andrea Drocco 
Università Ca' Foscari Venezia, Italia 


It is widely recognised that Sanskrit and, more generally, the study
of the languages of South Asia played an extremely important role
in the development of historical and comparative linguistics during
the nineteenth century. The same holds for the study of the Indian
grammatical tradition, especially the Sanskrit grammatical tradition,
which sees Pāṇini as its greatest exponent and representative.
However, until the end of the first decade of the twenty-first century,
no single journal in Western countries was devoted to the linguistics
of these languages. This is a notable fact, considering the importance
of old Indo-Aryan and Dravidian languages for the study of diachronic
linguistics, grammatical tradition and/or textual criticism - thanks
also to the multitude of texts representing the various phases of
linguistic evolution of these languages - and considering the high
linguistic diversity of South Asia and, consequently, its importance
for the study of multilingualism, language contact and language policy
and planning. With the exception of Indian Linguistics, the quarterly
journal (produced annually) of the Linguistic Society of India, it is
only recently, in fact, that academic journals focusing specifically on
the linguistic analysis of South Asian languages have started to be
published. This is the case of the Journal of South Asian Linguistics,
an online and open-access publication edited by Sameer ud Dowla Khan
and Emily Manetta (initially by Miriam Butt and Rajesh Bhatt) that,
from 2008 onward, has published original research articles and book
reviews. John Benjamins Publishing Company (Amsterdam and Philadelphia)
publishes the Journal of South Asian Languages and Linguistics, the
first volume of which came out in 2014. The focus of the journal is on
descriptive, functional and typological investigations, but descriptive
studies are also encouraged to the extent that they present analyses of
lesser-known languages, based on original fieldwork. The editor-in-chief
is Leonid Kulikov (initially Anju Saxena).

It is important to note that the scope of these journals is not a
single group/family of South Asian languages but, on the contrary, all
languages used, today and in the past, in the region and, moreover,
the South Asian linguistic landscape as a whole. Certainly, this is
the result of a changing approach to studying these languages. As a
matter of fact, according to this relatively new approach, in order to
have a correct view not only of the characteristics of the contemporary
languages of this region, but also of what their evolution must
have been, even recently, it is necessary to consider language contact
as an essential factor of language change in the Indian subcontinent.
As Hock (2016, 1) has pointed out, this change of perspective
has been a consequence of the fact that this region started to
be studied, from the point of view of linguistics, as a linguistic area.
Emeneau's papers (1980) on this topic were fundamental in this
respect (especially Emeneau 1956), followed in importance by Masica's
study of 1976. Since then, over the past five to six decades, there has
been a growing interest in the study of South Asian languages from
different perspectives and adopting various methodological approaches.
More recently, this interest has also expanded to the field of
endangered and unwritten languages, to the extent that it has now become
an established field of research for various academic projects around
the world (cf., for example, the Himalayan Languages Project, as well
as the various studies, grammars and/or documentation works published
in the context of the Hans Rausing Endangered Languages
Project and the Living Tongues Institute for Endangered Languages).

Notwithstanding this ferment in the discipline, with the increasing
presence of a new generation of established scholars, in some
countries the impact of linguistic studies on South Asian languages
has not been as significant as it has been in others. In this regard,
Italy is a case in point. As a matter of fact, after the pioneering
work of some eminent scholars dealing with Middle Indo-Aryan
or New Indo-Aryan languages (the emblematic example is represented
by Luigi Pio Tessitori, in particular by his well-known Grammar of
the Old Western Rājasthānī With Special Reference to... 1914-16), the
few Italian linguists who studied some of the various languages of
South Asia have devoted their attention to the analysis of Vedic,
especially from an Indo-European perspective, or to the Sanskrit
grammatical tradition. Unfortunately, with rare and recent exceptions,
most of these few studies are written in Italian and are thus difficult
for foreign scholars to access.

It is precisely in this context that Bhasa. Journal of South Asian
Linguistics, Philology and Grammatical Traditions has been launched.
Born from an idea shared by Antonio Rigopoulos and myself, and under
the patronage of the Department of Asian and North African Studies of
Ca' Foscari University of Venice, this new peer-reviewed, international
research journal on the languages of the Indian subcontinent welcomes
submissions adopting evidence-based approaches to all areas of
linguistics related to South Asian literary (classical and
modern/contemporary), spoken and/or endangered languages. The Journal
is published online by Edizioni Ca' Foscari, which was created in 2011
with the aim of encouraging dissemination of research results within
the University and from here to both the national and international
scientific community. For this reason, all publications are made
available online with free and open access in order to bolster and
encourage the free sharing of knowledge.

The main purpose of the Journal is twofold. 

On the one hand, it aims to collect papers devoted to general
synchronic linguistic analysis (including sociolinguistic analysis) of
a particular language (even as attested in a specific text), including
those incorporating a comparative analysis with other languages. In
keeping with this aim, the papers submitted can have a theoretical
and/or a descriptive approach, in both respects adopting a typological
and functional perspective.

On the other hand, one of the major aims of Bhasa is to better
understand the evolution of the various languages employed in South
Asia today and in the past. The term 'evolution' is here understood
from the point of view of linguistic history - according to a purely
diachronic linguistic perspective - as well as from the point of view of
the history of these languages, thus concerning the dynamics existing
between a specific language and the culture and socio-political
context of the society where this language is spoken. For this reason,
the Journal also includes in its scope the analysis of the history
of reading and reception studies in South Asia and, accordingly,
articles focusing on textual details and criticism and on the history
of manuscript traditions and circulation will also be considered. In
fact, it is our firm belief that it is only through the study of the
relationship between the languages of 'texts', with their marked
bias toward the educated classes, and their variants (the languages
of non-[canonised] texts) that it is possible to understand the
sociolinguistic relationship between a more standard, established
language and a plethora of sub-standard 'languages'. Above all, we ask
how the studies of philology and historical sociolinguistics
can come to the aid of historical linguistic analysis. Therefore,
the Journal is not solely devoted to the 'pure' linguistic study of South
Asian languages with a synchronic or a diachronic approach. Indeed,
one of the main purposes of this Journal is to transcend the old
dichotomy of synchrony and diachrony and to combine philology and
linguistics through some of the papers that will be submitted. Last
but not least, particular emphasis is also placed on the study of
grammatical traditions that have thrived in the South Asian regions.

A considerable part of Bhasa. Journal of South Asian Linguistics, 
Philology and Grammatical Traditions will be devoted to reviews of 
new books or specific important papers related in some way to the 
aims of the Journal. The Journal predominantly publishes articles 
in English but will occasionally also publish in Italian, French and 
German. 

In this first issue, Bhasa offers the academic reader five innovative
papers that show the results of some of its research themes.
We look forward to future article submissions mostly dealing with
research themes related to the topics of this new Journal, but also,
and especially, focusing on the whole of South Asian languages and
on the Indian grammatical tradition - i.e. Sanskrit, Prakrit, Pali,
Tamil, etc. - as well as on the grammatical tradition of Modern Indian
languages.

The paper by Peterson and Chevallier offers a very detailed (even
if preliminary according to the authors) typology of negation based
on a database of 25 structural features for 39 languages from three
language families and two language isolates, thus providing a first
analysis of the distribution of the different negative-marking
strategies throughout the subcontinent. In some cases of language
contact, the data also allows us to determine with some certainty the
type of contact situation that has led to the negative-marking
patterns documented.

The paper by Hook and Koul explores a particular formulaic construction
specifically dedicated to the expression of sarcasm in many
Indo-Aryan languages. Even if the focus of the two authors is on a
particular language, i.e. Kashmiri, their aim is to show that this
particular construction is attested not only in the majority of
Indo-Aryan languages, but also in some but not all of the major
Dravidian languages. As a consequence, they suggest that this specific
construction is another clear example of a 'trait' of South Asia as a
linguistic area.

The article by Patrick McCartney deals with 2001 and 2011 Census
of India data in which L1-L3 (first to third language) Sanskrit
tokens were returned during census enumeration. The main goal of
the paper, which is part of the Imagining Sanskrtland project, is to
locate and document how, where, and why the most important literary
Indo-Aryan language, Sanskrit, is spoken in the twenty-first
century. In particular, a theo-political discussion of Sanskrit's
imaginative power for faith-based development is provided, in order to
discuss how 'Sanskrit-speaking' villages signify an ambition toward
cultural renaissance.

The article by De Notariis is conceived as an introduction to questions
concerning the relationship between various versions of a Buddhist
text known in its Pali variant as Milindapañha, and in its Chinese
versions as Nàxiān bǐqiū jīng (那先比丘經; T 1670 versions A and
B). With respect to the latter, particular attention is paid to the
Western reception and to the problem of reconstructing a possible
archetype, adopting the guidelines provided by Gérard Fussman.

Carnesale's paper deals with the semantic/pragmatic-syntactic
interplay of Hindi predicative possessive constructions, especially
taking into account the concept of linguistic iconicity. The aim
of the paper is to show that each possessive construction in the
largest New Indo-Aryan language of modern South Asia is customised to
encode specific semantic properties.


I wish to thank the members of the Editorial Board, in particular E. 
Annamalai, Hans Henrich Hock and Silvia Luraghi for their welcome 
notes, the reviewers of this first issue, the entire staff of Edizioni 
Ca' Foscari, and the Department of Asian and North African Studies 
of Ca' Foscari University of Venice. 
Christian-Albrechts-Universität zu Kiel, Deutschland
Abstract The present study is a preliminary typology of negation in South Asian
languages, based on a database of 25 structural features for 39 languages from three language
families and two language isolates. The documented features include the form of the
negative marker, the relation of the negative construction to the corresponding affirmative
form, whether there are different negative constructions used in different TAM categories,
and whether these constructions are symmetric or asymmetric. This study also provides a
first analysis of the distribution of these different negative-marking strategies throughout
the subcontinent and suggests that a combination of both family bias and areal pressure
is needed to account for many of the observed distributions. In some cases of language
contact, the data also allows us to determine with some certainty the type of contact
situation which has led to the negative-marking patterns documented in the database.
eal linguistics.
1 Introduction


In his study of negation in South Asian languages, Bhatia (1995, 13) 
provides a few examples of negative marking in Hindi and five oth- 
er South Asian languages. For example, consider his Hindi examples, 
given here in (1) (gloss and order of sentences altered). 


(1a) vo   nahī̃     jā-egā.
     3SG  NEG.IND  go-FUT.3SG.M
     ‘He won't go’.
(1b) tū   mat         jā.
     2SG  NEG.NH.IMP  go.NH.IMP
     ‘Do not go’.
(1c) kyā  vo   na        jā-e?
     Q    3SG  NEG.SUBJ  go-SUBJ.3SG
     ‘May he not go?’


What all three examples in (1) have in common is that they are all
negated by a preverbal particle whose form depends on the mood of the
clause: indicative negation is indicated by the particle nahī̃ (1a), which
Bhatia (1995, 16) derives from a fusion of the negative marker na and
the copula ahi.1 The non-honorific imperative is negated by mat (1b)
while the subjunctive is negated by the negative particle na (1c). This
is similar in many respects to what we find in Sanskrit (Old
Indo-Aryan), where the preverbal particle mā is used to negate the
imperative, with the likewise preverbal particle na found elsewhere
(e.g. Whitney 1889, 413, § 1122c). However, this type of negation, with
a non-inflecting negative particle preceding a verb and two or three
modally determined distinctive forms, is by no means the only negating
strategy in South Asian languages, as we will show in the following pages.
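
The mood-dependent choice of negator in example (1) can be restated as a simple lookup. The following is only an illustrative sketch: the names `HINDI_NEGATORS`, `hindi_negator` and `negate`, as well as the mood labels, are hypothetical and are not taken from Bhatia (1995) or from the database used in this study. It covers only the three moods shown in (1).

```python
# Illustrative sketch only: maps the three clause moods of example (1)
# to the preverbal negative particle used in each.
# All names and mood labels here are hypothetical, chosen for illustration.
HINDI_NEGATORS = {
    "indicative": "nahī̃",   # (1a)
    "imperative": "mat",     # (1b), non-honorific imperative
    "subjunctive": "na",     # (1c)
}


def hindi_negator(mood: str) -> str:
    """Return the negative particle for a clause mood covered by (1)."""
    try:
        return HINDI_NEGATORS[mood]
    except KeyError:
        raise ValueError(f"mood not covered by example (1): {mood!r}") from None


def negate(verb_form: str, mood: str) -> str:
    """Place the mood-appropriate particle directly before the verb form."""
    return f"{hindi_negator(mood)} {verb_form}"


if __name__ == "__main__":
    for mood, verb in [("indicative", "jā-egā"),
                       ("imperative", "jā"),
                       ("subjunctive", "jā-e")]:
        print(mood, "->", negate(verb, mood))
```

The lookup makes explicit that the negator itself does not inflect: only the mood of the clause determines which of the three forms appears before the verb.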

The primary goal of the present study is to document as much
of the impressive array of negative marking in the languages of the
South Asian mainland as possible, based on our current database.2


1 Although from a purely formal viewpoint it could also derive from na 'NEG' and
=hi '=FOC', with nasalisation later spreading into the second syllable as this form
lexicalized, yielding nahī̃. For a similar development already in Sanskrit, see
Whitney 1889, 413, § 1122e.


2 This work represents a continuation of our ongoing areal-typological research on
the languages of South Asia, originally sponsored by the German Research Council
(DFG). The earlier project, whose database has been extended here to include
negation, was "Towards a linguistic prehistory of eastern central South Asia (and beyond)",
For practical reasons, in the present work we exclude languages of
the Trans-Himalayan (Tibeto-Burman) and Tai-Kadai groups as well
as languages spoken outside of the mainland (e.g. Dhivehi, Sinhala,
Nicobarese etc.), although these will eventually be added to our
database. The study is therefore still very much a work in progress and
as the database increases and takes further languages and features
into account, the picture will undoubtedly change somewhat. However,
as we show below, this study already provides a detailed overview
of negation strategies and their relation to the corresponding
affirmative categories in ca. 10% of the languages of sub-Himalayan
mainland South Asia, so that we believe that many of the distributional
tendencies outlined in the following pages are of substantial relevance.

We restrict ourselves in this study to formal features of negation 
such as the form of the negative marking itself, the relation of the 
negative construction to the corresponding affirmative form, and also 
which TAM categories the respective negative constructions are found 
in. What we will not deal with here, however, are the semantic and 
pragmatic aspects of negation, which e.g. Bhatia (1995) deals with in 
his study. As negation is such a complex topic, it is not feasible to be- 
gin by investigating all aspects of it at the same time, at least not if 
the goal is to conduct a more-or-less representative survey. Thus, while 
Bhatia (1995) deals in considerable detail with marking patterns but 
also with semantic and pragmatic aspects of negation, his study is re- 
stricted to six languages - five Indo-Aryan and one Dravidian. In con- 
trast, we deal here only with formal aspects of negation but in 39 lan- 
guages from four major stocks and two isolates, allowing us to give a 
much broader picture of the various negative strategies found in these 
languages, albeit at the expense of pragmatic and semantic aspects. 

The second goal of the present study is to use this information on 
marking strategies, to the extent possible, to help us identify past ar- 
eas of language contact and the different types of contact situations 
which likely underlie these patterns. Innovations in the field of lan- 
guage typology since the early 1990s now allow us to use areal-typo- 
logical methods to delve much deeper into linguistic prehistory than 
was previously possible (e.g. Nichols 1992; 1997), and more recent 
works in fields such as sociolinguistic typology (e.g. Trudgill 2011) 
and others often allow us to determine what type of contact likely pre- 
vailed in earlier times, e.g. prolonged societal bilingualism, language 
learning by large numbers of adult learners etc. 

This study is structured as follows: § 2 presents a brief discussion of
language contact in South Asia, which is often referred to as a
Sprachbund or 'linguistic area', somewhat incorrectly in our view. Instead,


DFG project 326697274. We gratefully acknowledge our indebtedness to the German 
Research Council for funding this research. 




Bhasha e-ISSN 2785-5953 
1, 1, 2022, 17-62 


John Peterson, Lennart Chevallier 
Towards a Typology of Negation in South Asian Languages 


we take a more differentiated view of language contact here and
argue that the type of contact phenomena which is generally thought
to constitute a linguistic area is in fact only one possible outcome of
language contact, one which however is not supported by the data in
South Asia. This is followed in § 3 by a brief discussion of our sample
in § 3.1 and a detailed discussion of the features in the database in
§ 3.2, which largely follows the distinctions made in the crosslinguistic
typological study of negation in Miestamo (2005), although we
deviate occasionally from the methods in that study, as our goals here
differ somewhat from Miestamo's. Then, in § 3.3, we briefly address
'zero negation', found in Dravidian.

In § 4 the results of our study are discussed, concentrating primarily
on the various language clusters in the data. The significance of this
data is assessed in § 5, where we discuss which clusters are likely the
result of language contact and what type of contact may be responsible
for the patterns we observe. Finally, § 6 provides a summary of the
present study and mentions a number of topics for future research.


2 Language Contact. South Asia as a ‘Linguistic Area’? 


Typological similarities among South Asian languages belonging to dif- 
ferent stocks were noted at least as early as Bloch (1934, 322-8), al- 
though the real momentum in research on language convergence in 
South Asia began with Emeneau (1956), who brought the spread of 
a number of features throughout much of the subcontinent to the at- 
tention of a larger linguistic audience. In the years that followed, nu- 
merous further features were suggested by various authors, many of 
which are summarised in Masica's (1976) landmark work on South 
Asia as a linguistic area. Masica's study expands the scope of research 
on South Asia as a "linguistic area" to include all of Eurasia and much 
of Africa, in order to determine to what extent South Asia differs lin- 
guistically from neighbouring regions. This is important since assum- 
ing that South Asia is a linguistic area in any meaningful sense of the 
term implies that it exhibits linguistic traits which distinguish it from 
its neighbours, something that the data however does not support. 
Ebert (2006) comes to a very similar conclusion and also calls at- 
tention to a typological division of South Asia into two different zones, 
an eastern and a western, with the line of divide at about the 84th 
meridian, cutting Bihar, Jharkhand, Odisha and northeastern India off 
from the western subcontinent. More recent work on language con- 
tact in South Asia confirms this major typological schism, although 
not necessarily based on the same features as Ebert uses. For exam- 
ple, Peterson (2017); Ivani, Paudyal, Peterson (2021) and Borin et al. 
(2021) all call attention to structural differences distinguishing east- 
ern and western Indo-Aryan from one another, which Peterson (2017) 
refers to as the "Indo-Aryan east-west divide", with e.g. split-ergativity
found in most western languages, whereas it is largely lacking in
eastern IA languages; similarly, arbitrary gender is typically found in
western languages, while eastern languages usually lack it etc. Such
a divide, cutting right through the subcontinent and creating two
typologically distinct regions, of course contradicts the very notion of a
homogeneous linguistic area in the subcontinent.

If in fact anything like a South Asian linguistic area really does ex- 
ist it would seem that it best fits what Campbell (2017, 27) refers to 
as a "trait-sprawl area" or “TSA”. In this type of contact area, some 
features are found 


crisscrossing some languages while others crisscross other lan- 
guages, with some extending in one direction, others in another 
direction, with some partially overlapping others in part of their 
distribution but also not coinciding in other parts of their geograph- 
ical distribution. 


This is in stark contrast to the "linguistic area sensu stricto" or "LASS", 
in which features are shared across the languages of a clearly delim- 
ited geographical area (Campbell 2017, 28). 

Many researchers of language contact in South Asia appear to be 
looking for a list of features with which they can define a "LASS"-type 
area, in which ideally all South Asian languages share all of these 
traits. However, the facts clearly support a more "TSA"-like language
area, in which certain features are found in many languages but the 
individual features do not all show the same geographical distribu- 
tion. There may be "LASS"-type areas in South Asia, but if so these are 
likely to be found at the micro-level, which has been the focus of stud- 
ies on language contact in South Asia in recent years (e.g. Abbi 1997; 
Ebert 1993; 1999; Osada 1991; Peterson 2010; 2015; Saxena 2015). 
We will therefore not look for signs of a larger 'South Asian linguis- 
tic area' here but will instead point out what appear to be contact-in- 
duced phenomena where these are suggested by the data. 

In addition to identifying likely contact-induced areal patterns, we 
also hope to determine the societal conditions which led to the pat- 
terns we observe in the data. For example, recent works in sociolin- 
guistic typology (e.g. Trudgill 2011) show that certain linguistic struc- 
tures are more likely to emerge from one type of contact situation than 
from another. Simplifying somewhat, the argumentation in Trudgill 
(2011) which is relevant for our analysis can be summarised under 
the two following types: 

* when a large percentage of speakers of a particular language
are adult learners, this often leads to phonological and
morphological simplifications in that language;

* in contrast, long-term societal bilingualism, especially in cases
where speakers learn their second language during childhood,
often leads to complexity.


"Simplification" involves the following three processes (Trudgill 2011, 
20-2): 

* the regularisation of irregularities, e.g. in English cows as the 
plural of cow, instead of earlier kine; 

* an increase in lexical and morphological transparency, e.g. twice 
and went are less transparent than two times and did go, so re- 
placing the former by the latter represents an increase in trans- 
parency; 

* loss of redundancy, of which there are two types: a. syntagmat- 
ic redundancy or the repetition of grammatical information, e.g. 
grammatical agreement on adjectives; b. paradigmatic redun- 
dancy or the morphological expression of grammatical catego- 
ries, such as number, case, tense, aspect, voice, mood, person, 
and gender. 


"Complexification" is essentially the opposite of simplification and in- 
volves the following processes (Trudgill 2011, 62): 

* irregularisation; 

* increase in opacity (less transparency); 

* increase in syntagmatic redundancy; 

* addition of morphological categories. 


As noted above, "complexification" can arise from long-term, stable 
language contact in which both languages are learned predominantly 
by children, as opposed to adult learners. This primarily concerns 
the addition of morphological categories: new categories are copied 
from one or more neighbouring languages into another language, where 
they do not replace existing categories but are found in addition to 
them (Trudgill 2011, 27). 

Many of these tendencies have been confirmed in quantitative studies 
(e.g. Bentz, Winter 2013; Sinnemäki 2009; Sinnemäki, Di Garbo 
2018 among others), and the underlying assumptions of Trudgill 
(2011) have also been used to try to unravel prehistorical settlement 
patterns in South Asia (e.g. Peterson 2022). The present study repre- 
sents a further step in this direction. 
3 The Sample and the Database 


In this section we discuss the choice of languages in our sample (§ 3.1) 
as well as the features in our database (§ 3.2). 


3.1 The Sample 


The major difference between our study and Miestamo (2005), the most 
exhaustive typological study of negation we know of, is with respect to 
the sample on which it is based. Miestamo (2005) aims to be a repre- 
sentative and areally and genealogically balanced database of negation 
in human languages. As such, that sample has been compiled taking ge- 
nealogical and areal biases into account. In contrast, our primary aim 
is to describe negation in as many languages from as many regions and 
language families in mainland South Asia as possible with the second- 
ary goal of identifying signs of language contact in the data. For practi- 
cal reasons, we have not yet been able to include Trans-Himalayan and 
Tai-Kadai languages in our database, but we hope to add languages of 
these two families soon. Our sample is therefore of an entirely different 
nature than Miestamo's and is basically one of convenience, essential- 
ly using any grammars for any of these languages which were detailed 
enough for us to get the necessary information on negation for the re- 
spective language, although every attempt has been made to include 
as many grammars from all branches of all families as possible. The 
present study provides an overview of the database in its current form. 

For each language, the database currently contains only one varie- 
ty. For languages which have a well described standard variety, such 
as Hindi or Kannada, it is this variety which we have documented. For 
others, such as Kharia, Northwestern Kolami, Gta? etc., it is the vari- 
ety described in the grammar which we used. We hope to add further 
(dialectal) varieties at a later date. 

Unfortunately, negation is not dealt with in equal detail in the gram- 
mars we consulted, so that not all of our questions could be answered 
definitively for all languages. In order to maintain a consistent level 
of representativeness for all of the languages contained in our sam- 
ple, we therefore excluded all languages from our database for this 
study for which we did not have sufficient data; our lower limit for 
inclusion in the present study was set at 66% of the features in 
the database (see § 3.2). With presently 25 features in the database 
to be described, this means that data for at least 17 features (= 68%) 
was required for the inclusion of the respective language in our sam- 
ple. This narrowed the database down to 39 languages. These 39 lan- 
guages and their respective genealogical information are given in Ap- 
pendix A. Their approximate locations, which have been taken from 
Glottolog (Hammarström et al. 2021) and mapped with the help of 
lingtypology (Moroz 2017), are given in Appendix B. 
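
The inclusion criterion described above can be sketched as follows. This is only an illustration: the threshold (66%) and the total of 25 features come from the text, whereas the language names and their counts of documented features are hypothetical placeholders, not values from the database.

```python
import math

TOTAL_FEATURES = 25   # features currently in the database (from the text)
MIN_COVERAGE = 0.66   # lower limit for inclusion (from the text)

# Smallest whole number of documented features that reaches the limit.
min_features = math.ceil(MIN_COVERAGE * TOTAL_FEATURES)


def is_included(documented: int) -> bool:
    """Return True if enough features are documented for a language."""
    return documented >= min_features


# Hypothetical counts of documented features per language.
documented_counts = {"Language A": 25, "Language B": 17, "Language C": 16}

sample = [name for name, n in documented_counts.items() if is_included(n)]

print(min_features)                   # 17
print(min_features / TOTAL_FEATURES)  # 0.68
print(sample)                         # ['Language A', 'Language B']
```

Since a language cannot have half a feature documented, the 66% limit is reached only from 17 of 25 features onwards, which is why the effective threshold is 68%.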
Despite not being completely balanced, the study nevertheless includes 
languages from all three major families other than Trans-Himalayan 
and Tai-Kadai, i.e. Indo-European, with 15 Indo-Aryan languages 
and one Iranian language (Balochi, spoken in Pakistan), 11 Dravidian 
languages, 9 Munda languages, and the two isolates Nihali and Ku- 
sunda. It also includes languages from all major branches of the three 
major language families. Furthermore, with 39 languages, the sample 
contains data for ca. 11.4% of the 341 languages of South Asia (i.e. India, 
Bangladesh, Bhutan, Nepal and Pakistan) without the "Sino-Tibetan" 
and "Kra-Dai" languages of South Asia as listed in the Ethnologue 
(Eberhard, Simons, Fennig 2021) and thus provides a good overview 
of the various negative-marking strategies found in the region. 
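
The coverage figure given above follows directly from the two counts cited in the text; the variable names in this small check are our own.

```python
SAMPLE_SIZE = 39        # languages in the sample (from the text)
REFERENCE_TOTAL = 341   # Ethnologue-based count cited in the text

coverage = SAMPLE_SIZE / REFERENCE_TOTAL
print(f"{coverage:.1%}")  # 11.4%
```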


3.2 The Database 


In his studies of negation, Miestamo (2005; 2013a; 2013b) deals with 
negative marking and its relation to affirmative constructions from a 
typological perspective, and we largely follow him in the present study. 
We therefore begin with a brief introduction to the central concepts 
relevant to negation and the distinctions which Miestamo makes and 
in which we follow him, while also discussing the differences between 
our study and his with respect to the database. Miestamo defines a 
"standard negation" or "SN" construction as follows: 


A SN construction is a construction whose function is to modify a ver- 
bal declarative main clause expressing a proposition p in such a way 
that the modified clause expresses the proposition with the opposite 
truth value to p, i.e. ~p, or the proposition used as the closest equiv- 
alent to ~p in case the clause expressing ~p cannot be formed in 
the language, and that is (one of) the productive and general means 
the language has for performing this function. (Miestamo 2005, 42) 


We follow Miestamo's definition of standard negation in the present 
study but expand the object of our investigation to include negative im- 
peratives and other negative non-indicative categories as well as sup- 
pletive negative copular verbs to check these for potential areal clus- 
ters. We also include a discussion of the so-called 'zero negation' in 
Dravidian, in § 3.3, as it appears to be unique in the languages of the 
world (e.g. Miestamo 2010; Pilot-Raichoor 2011) and as such should 
not be lacking in a discussion of negation in South Asian languages.³ 


3 However, as zero-negation presently only occurs in Kannada in our sample, it is not 
yet included in our database but can be added at a later date as more languages are in- 
corporated into the database. 
The following discussion illustrates the individual distinctions made 
in our database with languages from our sample. In doing so, it also il- 
lustrates the types of negative constructions found in our data. While 
not all distinctions are illustrated here, all major negation types are 
illustrated, as well as some minor but common variations of these dif- 
ferent types. It should therefore be sufficient to give the reader a gen- 
eral impression of the different negative constructions found in the 
subcontinent south of the Himalayas. 

The primary distinction with respect to negative marking and its 
relation to affirmative marking is what Miestamo refers to as symmet- 
ric vs asymmetric structures. Symmetric structures are those which 
show no structural differences between the affirmative and the neg- 
ative constructions other than the addition of the negative marker(s) 
in negation. A simple illustration of this is given in example (2) from 
Sadri (Indo-Aryan), where the only difference between the affirma- 
tive (2a) and the negative (2b) is the absence vs presence of the neg- 
ative marker ni. 

Sadri (Indo-Aryan: Jharkhand, Chhattisgarh, Odisha) 


(2)  a. bujh-on-a                     b. ni   bujh-on-a 
        understand-PRS.1SG=NAR           NEG  understand-PRS.1SG=NAR 
        ‘I understand’                   ‘I don't understand’ 


However, in asymmetric constructions other differences are also 
found. This can be seen in example (3) from Konkani. In the affirma- 
tive (3a) the finite verb is marked by the future-tense marker -tel, to 
which the PNG marker -5 '1SG.M' attaches. In contrast, in the negative 
(3b) the main verb is a participle (i.e. non-finite) and marked as 
masculine singular (=c-9 ‘FUT.PTCP-M.SG’);⁴ this form is then followed 
by the negative copula in the present tense, marked for 1st person, 
singular. In other words, the presence of nd ‘I am not’ in the negative 
construction is not the only difference between the two forms, as the 
marker of future tense is different in both, and the finite status of the 
main verb is also different in the affirmative and negative. 


4 The form -2 marks only masculine, singular; the 1st person, singular is marked by 
nasalisation. 
Konkani (Indo-Aryan: Goa, Maharashtra, Karnataka, Kerala) 


(3)  a. rig-tel-5              b. rig=c-2              ná 
        enter-FUT-1SG.M           enter-FUT.PTCP-M.SG  NEG.COP.PRS.1SG 
        ‘I will enter’            ‘I will not enter’ 
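
The distinction between symmetric and asymmetric constructions can be approximated mechanically on the glosses of examples (2) and (3). The sketch below is a simplification under our own assumptions: it compares gloss lines only, treats every morpheme whose gloss begins with NEG as a negative marker, and the function names are hypothetical. It is not the procedure used to code the database.

```python
import re


def strip_negation(glossed_words):
    """Drop every morpheme whose gloss starts with NEG; keep the rest."""
    remaining = []
    for word in glossed_words:
        # Split e.g. "do-PRS.1=NEG" into "do", "-PRS.1", "=NEG".
        morphemes = re.findall(r"[=-]?[^=-]+", word)
        kept = [m for m in morphemes if not m.lstrip("=-").startswith("NEG")]
        rebuilt = "".join(kept).lstrip("=-")
        if rebuilt:
            remaining.append(rebuilt)
    return remaining


def is_symmetric(affirmative, negative):
    """True if the negative equals the affirmative plus negative marker(s)."""
    return strip_negation(negative) == list(affirmative)


# Sadri, example (2): only the negative particle is added.
sadri = is_symmetric(["understand-PRS.1SG=NAR"],
                     ["NEG", "understand-PRS.1SG=NAR"])

# Konkani, example (3): the main verb also changes its form.
konkani = is_symmetric(["enter-FUT-1SG.M"],
                       ["enter-FUT.PTCP-M.SG", "NEG.COP.PRS.1SG"])

print(sadri, konkani)  # True False
```

Because only glosses are compared, differences that are visible in the forms but not in the glossing would go unnoticed; the sketch is meant to make the definition concrete, not to replace inspection of the data.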


Different types of asymmetry are possible, such as constructional asym- 
metry, as in example (3) from Konkani, where the respective affirma- 
tive and negative forms are different, but the individual categories of 
the paradigm as a whole are the same in both the affirmative and neg- 
ative, i.e. there is a positive and a negative form for the future tense 
in Konkani. There can also be paradigmatic asymmetry, e.g. in Kannada 
in example (4), where a distinction made between the present (4a) 
and the future (4b) tenses in the affirmative is lost in negation (4c). 
In short, the affirmative and negative paradigms are different with respect 
to the temporal distinctions they make, in addition to the asymmetric 
construction. 

Kannada (Dravidian: Karnataka) 


(4)  Present affirmative            Future affirmative 
     a. nànu  mad-utt-ēne           b. nànu  mad-uv-enu 
        1SG   do-PRS-1SG               1SG   do-NPST-1SG 
        ‘I do’                         ‘I will do’ 


     Present / future negative 

     c. nanu  mad-uv-ud=illa 
        1SG   do-NPST-NMLZ=NEG.COP 
        ‘I do not / will not do’ 


In the present study we are primarily interested in constructional sym- 
metries/asymmetries and will not generally refer further to paradig- 
matic asymmetries, with the exception of the following type, which is 
directly related to the forms themselves: e.g. in the South Munda lan- 
guage Gutob, TAM markers have different values in the affirmative and 
negative paradigms, as shown in [tab. 1]. In other words, in this kind 
of paradigmatic asymmetry the value of the individual TAM markers 
differs with respect to polarity. Miestamo refers to this kind of system 
as "paradigmatic displacement" (2005, 55). 
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Tablei  Negation in Gutob (Munda: Odisha) (Vof forthcoming) 


Affirmative Negative 
middle active middle active 
FUT -log tu -a 17) 
PST -gV -0P -to 
HAB -to - 
IMP -a g -gV -0P 
OPT -eP -e? 


For example, in the future affirmative in Gutob in [tab. 1] we find the 
markers -loŋ in the middle voice and tu in the active, whereas the 
corresponding tense markers in the negative are -a and zero (Ø), 
respectively. These last two markers are also found in the affirmative 
paradigm, however as markers of the affirmative imperative, not the future. 
Furthermore, the affirmative past middle marker -gV and active-voice 
marker -oʔ are also found in the negative paradigm, where, however, 
they mark the negative imperative, not the past tense etc. 
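
Paradigmatic displacement of this kind can be detected by asking which marker forms occur in both polarities but with different TAM values. The following sketch encodes only the cells explicitly discussed in the preceding paragraph; the data structure and function names are our own assumptions.

```python
from collections import defaultdict

# (TAM value, voice) -> marker, restricted to the cells discussed in the text.
affirmative = {
    ("IMP", "middle"): "-a", ("IMP", "active"): "Ø",
    ("PST", "middle"): "-gV", ("PST", "active"): "-oʔ",
}
negative = {
    ("FUT", "middle"): "-a", ("FUT", "active"): "Ø",
    ("IMP", "middle"): "-gV", ("IMP", "active"): "-oʔ",
}


def values_by_marker(paradigm):
    """Map each marker form to the set of TAM values it expresses."""
    result = defaultdict(set)
    for (tam, _voice), marker in paradigm.items():
        result[marker].add(tam)
    return dict(result)


def displaced_markers(aff, neg):
    """Markers found in both polarities whose TAM values differ."""
    aff_values, neg_values = values_by_marker(aff), values_by_marker(neg)
    return {
        marker: (aff_values[marker], neg_values[marker])
        for marker in aff_values
        if marker in neg_values and aff_values[marker] != neg_values[marker]
    }


for marker, (aff_tam, neg_tam) in displaced_markers(affirmative, negative).items():
    print(marker, sorted(aff_tam), "->", sorted(neg_tam))
```

For the Gutob cells above, all four markers are reported: -a and Ø shift from imperative to future, and -gV and -oʔ from past to imperative.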

Miestamo (2005) makes further distinctions in his study, such as 
the different types of asymmetric categories with respect to the finite 
status of the main or auxiliary verb (Type A/Fin), or types of grammat- 
ical categories involved in the asymmetry (Type A/Cat), such as TAM, 
evidentiality, voice, person and number etc. As ours is a preliminary 
typological study of negative constructions in South Asian languag- 
es and we are primarily interested in general patterns involving sym- 
metry vs asymmetry, these subcategories will only be referred to in 
passing where relevant, and finer distinctions such as these will not 
be dealt with here in any systematic fashion in the database. We hope 
to add these at a later date. 

Another basic distinction, made in both Miestamo (2005) and Bhatia 
(1995) and which we also make here, is the type of negative marking 
in a particular construction and its position with respect to the 
main verb. For example, if the negative marker is an affix/clitic, a 
feature which we have borrowed from the GramBank consortium (feature 
GB107 in our database),⁵ we would like to know whether it is a 
prefix/proclitic or a suffix/enclitic.⁶ Furthermore, if the negative 
marker is a suffix/enclitic, we note in the database whether it is 
word-final or if it is followed by markers of other categories, such as 
PNG. Indicative negation in Nepali is, for example, generally expressed 
as a suffix, following the verbal root and TAM markers but preceding 
(or fusing with) person/number markers. It is never word-final, except 
when it fuses with person/number marking (1st person singular) or with 
zero person/number markers (3rd person, singular), hence we consider it 
to be a non-final negative suffix. This is shown in [tab. 2], adapted 
from Matthews (1998, 94) for the past tense of the verb gar- 'do'. 


5 https://glottobank.org/#grambank. 


6 We consciously chose not to differentiate between affixes and clitics 
in our database, as the criteria for differentiating between these two 
categories, if a distinction is made at all in the respective studies, 
are not always clearly stated, and in many cases different authors 
working on the same language come to different conclusions with respect 
to their status. 

Further difficulties surfaced with respect to whether or not a negating 
element was a 'particle' or a bound form (see further below in the main 
text). Often authors were somewhat inconsistent in their treatment of 
these units as one type or another, so that we had to pick one of the 
alternatives, and in a few further cases we disagreed with an author's 
decision. Here we took a variety of factors into account, including the 
mobility of this unit in the sentence, including in ‘poetic’ or other 
special language (e.g. did it necessarily appear before or after the 
verb?), whether it could receive independent stress etc. 


Table 2  Past-tense negation in Nepali (Indo-Aryan: Nepal, Sikkim, Bhutan) (adapted 
from Matthews 1998, 94) 


          Affirmative    Negative 
1SG       gar-ẽ          gar-i-nã 
2SG       gar-i-s        gar-i-na-s 
3SG.NF    gar-y-o        gar-e-na-Ø 
3SG.F     gar-i*         gar-i-na-Ø 
1PL       gar-y-ãw       gar-e-n-ãw 
2PL       gar-y-aw       gar-e-n-aw 
3PL.NF    gar-e          gar-e-na-n 
3PL.F     gar-i-n        gar-i-na-n 


* Written with a long <ī>; however, vowel length is not phonemic in Nepali, and Matthews (1998, 
3-4) writes that there is no difference in pronunciation between <i> and <ī>. 


In addition to the fact that person/number markers differ to some 
extent between the affirmative and negative forms, which is not of 
concern here at the moment (but see further below), the negative suffix 
-na in Nepali fuses with person/number agreement in [tab. 2] in the 
1st person singular but not elsewhere. Furthermore, where it does 
not fuse with person/number marking, it is clear that person/number 
marking follows the negative suffix -na. We therefore take -na to be a 
non-final suffix in Nepali in standard negation. 

As noted above, we also include non-indicative negative marking 
in our database, although this is not standard negation, as we wished 
to differentiate between those languages with only one type of negative 
marker and those with various markers based on mood. For example, the 
injunctive in Nepali is negated through the word-initial prefix na- 
(Matthews 1998, 197), not a suffix as in the indicative (cf. e.g. 
ma gar-u ‘may I do?’ vs ma na-gar-i ‘may I not do?’). In other words, 
here we have two different strategies for marking negation based on 
differences in mood, one in the indicative (suffixed -na), one in the 
injunctive (prefixed na-), both of which are encoded in the database. 

Other types of negative marking include suppletion and two mark- 
ing features taken from the GramBank consortium, namely inflect- 
ing words such as negative copulas (GB298, 299) and non-inflecting 
words, so-called ‘particles’.⁷ Particles can also differ from language to 
language. One type, the simplest of all types in our database, is found 
e.g. in Maithili, where negation is always marked by the particle nəi 
(in formal and written styles, nahi), which is usually positioned before 
the verb.⁸ There are no alternative forms based on TAM, no suppletive 
negative copulas, and no asymmetrical constructions. Consider 
the examples in (5) and (6), from Yadav (1996, 305-6). 

Maithili (Indo-Aryan: Bihar, Nepal) 


(5)  chora  nəi  sut-eit     əich 
     boy    NEG  sleep-IMPF  AUX.PRS.3NH 
     ‘The boy does not sleep’. 


(6)  nəi  jo! 
     NEG  go-IMP.2NH 
     ‘Don’t go!’ 


In other languages, the morphosyntax of negative particles can be 
somewhat more complex, even ignoring here differences in negative 
markers with respect to mood. For example, in the South Munda language 
Kharia, indicative negation is marked by the particle um, which 
generally appears directly before the predicate. In this case, the 
enclitic subject index in all persons except the 2nd person singular, 
non-honorific, obligatorily ‘moves’ away from its position following 
the predicate (7a) and attaches to the negative particle (7b), from 
Peterson (2011, 335). With the 2nd person singular, non-honorific, 
however, this index may optionally attach either to the predicate or 
to the negative particle, as in (8), from Malhotra (1982, 285). 


7 We deviate here somewhat from GramBank with respect to the definition 
of “inflecting words”, which we consider to be all words that can either 
be used by themselves as predicates, with finite verbs, or which e.g. 
can be used as light verbs to form acceptable predicates in a language 
requiring predicates to have a verbal element, i.e. a copula. This is 
independent of whether or not these units are marked for person, 
number, TAM etc. 

We also differ in our analysis in some cases from Miestamo (2005) with 
respect to whether an element is a negative auxiliary or a negative 
particle. For example, Miestamo (2005, 78-9) considers Kannada illa 
‘am/is/are not’ to be a suffix (see also Miestamo 2005, 141 in this 
respect), however this is more an artefact of the writing system than 
an indication of the status of this unit as a suffix. illa is in fact 
the negative copula and enclitic in this position. Since we consider a 
‘finite form’ in this study to be a word which can either be used as a 
main predicate in its own right or which functions as a light verb to 
make non-verbal predicates acceptable as main predicates, such as illa 
in Kannada, we view this form as finite. 


8 Although it can take other positions for stylistic purposes, such as 
in poetry (Yadav 1996, 387-8). 

Kharia (South Munda: Jharkhand, Chhattisgarh, Odisha) 


(7)  a. ter[-e]-in                b. um=in    ter-e 
        give=ACT.IRR=1SG             NEG=1SG  give=ACT.IRR 
        ‘I will give’.               ‘I will not give’. 

(8)  a. ubhron      um-em    dqe-na          b. ubhron      um   de=na=m. 
        these.days  NEG=2SG  come=MID.IRR       these.days  NEG  come=MID.IRR=2SG 
        ‘These days you do not come’. 


This type of variable marking with respect to person and number is 
found in our corpus only in Munda languages such as Kharia (South 
Munda), Santali, Mundari and Ho (North Munda), and only in one re- 
gion, namely Jharkhand, Chhattisgarh and Odisha, hence we did not 
encode this ‘movement’ in the database. If required, this can easily be 
added to the database at a later date. 

Negation is also marked periphrastically in many languages, gen- 
erally with a non-finite form of the main verb and a finite auxiliary. 
Examples of these are given in (9) from Konkani. While for the most 
part these periphrastic formations represent asymmetric construc- 
tions which differ from the affirmative forms in more than one way, 
the past tense in Konkani in (9a) is symmetric, as the only difference 
between affirmative and negative is the presence of the negative cop- 
ula in negation. In contrast, the future, present perfect and present 
tense are all asymmetric constructions, as the form of the main verb 
is different in the affirmative from that in the negative, in addition to 
the negative auxiliary (9b-9d).⁹ Only in (9a) are both parts of the negative 
predicate ‘finite’, whereas in all other negative forms only the 
copula is finite while the main verb is non-finite. 


9 There is a further asymmetry in the present perfect with respect to gender, which 
is expressed in the affirmative but not in the negative. Otherwise, gender is expressed 
either in both the affirmative and negative forms (past tense, future tense) or in nei- 
ther of these (present tense), so that there is no asymmetry in these other categories 
with respect to gender. This was not noted specifically in the database. 
Konkani (Indo-Aryan: Maharashtra, Goa, Karnataka, Kerala) 


(9)  a. Past tense - main verb is finite in the negative 
        rig-l-5                 rig-l-5              nà 
        enter-PST-1SG.M         enter-PST-1SG.M      NEG.COP.PRS.1SG 
        ‘I (M) entered’         ‘I (M) did not enter’ 
     b. Future tense (= (3) above) - main verb is a participle in the negative 
        rig-tel-5               rig-c-5              nà 
        enter-FUT-1SG.M         enter=FUT.PTCP-M.SG  NEG.COP.PRS.1SG 
        ‘I (M) will enter’      ‘I (M) will not enter’ 
     c. Present perfect - main verb is an infinitive in the negative 
        rig-là                  rig-ük               ná 
        enter-PERF.1SG.M        enter-INF            NEG.COP.PRS.1SG 
        ‘I (M) have entered’    ‘I have not entered’ 
     d. Present tense - the main verb consists only of the stem in the negative* 
        rig-tà                  rig-ná 
        enter-IPFV.1SG          enter=NEG.COP.PRS.1SG 
        ‘I enter’               ‘I don’t enter’ 


* Although these two elements are written together as one word in the negative 
present tense, the verb stem can stand on its own as a separate word in some 
environments (including but not restricted to the imperative). We therefore 
consider the negated present tense to consist of the stem and the enclitic negative 
copula. 


With respect to mood, we also noted for each language whether differ- 
ent negative strategies were found based on any TAM categories, not 
just mood. For example, Bengali shows an asymmetry in the indicative 
in the present and past perfect: The Bengali indicative normally shows 
symmetry between affirmative and negative paradigms, the only differ- 
ence being the verb-final enclitic =na in the negative, as in (10a) vs (10b). 
However, the present and past perfect are asymmetric; here the same 
marker that is used to mark the present tense in the affirmative and 
negative combines with a different negative marker, =ni, to negate the 
present and past perfect, as shown in (11a-c). The two perfect catego- 
ries thus show constructional and paradigmatic asymmetry and, like the 
Gutob data (see [tab. 1]), are an example of paradigmatic displacement. 


(10) a. kor-i                   b. kor-i=na 
        do-PRS.1                   do-PRS.1=NEG 
        ‘I/we do’                  ‘I/we do not do’ 

(11) a. kor-e-chi               b. kor-e-chilam           c. kor-i=ni 
        do-LNK-PRS.PERF.1          do-LNK-PST.PERF.1         do-“PRS”.1=NEG.PERF 
        ‘I/we have done’           ‘I/we had done’           ‘I/we have/had not done’ 
As a last entry in the database, we noted whether asymmetric negative 
constructions were found at all in a language, in order to catch 
any possible types of asymmetry which may be found in a particular 
language but have not been treated systematically in the database. 
This is the case for example with the past tense in Nepali shown in 
[tab. 2] above, where PNG markers differ to some extent between the 
affirmative and negative forms. The past tense is indicated by one of 
the allomorphs -e / -i / -y, followed by the negative marker -na and PNG 
marking, which differs for some forms, such as gar-y-o [do-PST-3SG.NF] 
‘he did’ vs gar-e-na-Ø [do-PST-NEG-3SG.NF] ‘he did not do’ or the 
corresponding plurals gar-e [do-PST.3PL.NF] 'they did' vs gar-e-na-n 
[do-PST-NEG-3PL] ‘they did not do’ (cf. once again [tab. 2] above). 
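
The positional distinctions made in the database (prefix/proclitic vs suffix/enclitic, and word-final vs non-final) can be read off a single-word gloss as sketched below. The function name and category labels are our own; the first two glosses are taken from the Nepali and Bengali examples cited in this section, while the third gloss is a hypothetical stand-in for a prefixed negative such as the Nepali injunctive.

```python
import re


def negation_position(gloss: str) -> str:
    """Classify where the NEG morpheme sits within one glossed word."""
    morphemes = re.split(r"[=-]", gloss)
    positions = [i for i, m in enumerate(morphemes) if m.startswith("NEG")]
    if not positions:
        return "no negative marker in this word"
    if len(morphemes) == 1:
        return "free form (particle or inflecting word)"
    first = positions[0]
    if first == 0:
        return "prefix/proclitic"
    if first == len(morphemes) - 1:
        return "word-final suffix/enclitic"
    return "non-final suffix/enclitic"


print(negation_position("do-PST-NEG-3PL"))  # Nepali past: non-final suffix/enclitic
print(negation_position("do-PRS.1=NEG"))    # Bengali present: word-final suffix/enclitic
print(negation_position("NEG-do-INJ"))      # hypothetical gloss: prefix/proclitic
```

A fused form, such as the Nepali 1st person singular, would need a gloss that keeps NEG as a separate label to be classified this way; the sketch does not attempt to handle fusion.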

Similar to Miestamo (2005, 58-9) we ignored minor phonological 
differences between affirmative and negative forms which were not 
connected to an identifiable function. For example, in Nepali the cop- 
ula ho ‘is’ has the negated form hoi-na ‘is not’, not the expected form 
*ho-na. However, this ‘suffix’ -i cannot be assigned any function, at 
least not from a synchronic perspective. As we are clearly not dealing 
here with suppletion, and as this -i has no identifiable function, this 
difference was not documented in the database. 

We also did not document asymmetries in our database that are not 
related to the verb phrase, such as variations in case marking between 
the affirmative and negative. While no such examples came to our at- 
tention, we made no systematic attempt to document such features. 

Summing up, we documented the following features with respect 
to negation: 

* whether negation can be marked by a particle, inflecting word 
(e.g. negative copula as an auxiliary) or a clitic/affix, as well as 
the position of this last type. Also, if this unit is a suffix/enclitic, 
whether this marker is word-final (e.g. Bengali) or non-word-fi- 
nal (e.g. Nepali); 

* whether copular or other verbs can be marked as negative 
through suppletion; 

* whether negation can be marked by an inflecting word together 
with a finite predicate, a participle, an infinitive or another type 
of (non-finite) verb form and whether this negative construction 
is asymmetric; 

* whether there are any different negation strategies based on 
TAM categories and if so, which and whether these are cases of 
asymmetric negation; 

* whether TAM markers with the same form have different TAM val- 
ues in affirmative and negative categories, and finally 

* whether there is any asymmetric negation in the language, in or- 
der to locate possible asymmetries not included above. 


The individual features documented are given in Appendix C. 
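
To make the kind of record described above concrete, the following sketch shows one possible encoding of a language as a set of yes/no features, with `None` for questions a grammar does not answer. The field names are hypothetical and far fewer than the 25 features of the actual database (see Appendix C); the values for Maithili and Nepali are restricted to what is stated in the discussion above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class NegationRecord:
    """One language variety; None means no clear answer was available."""
    language: str
    particle: Optional[bool] = None
    inflecting_word: Optional[bool] = None
    affix_or_clitic: Optional[bool] = None
    suffix_word_final: Optional[bool] = None
    suppletive_negative_verb: Optional[bool] = None
    tam_based_strategies: Optional[bool] = None
    paradigmatic_displacement: Optional[bool] = None
    any_asymmetry: Optional[bool] = None

    def coverage(self) -> float:
        """Share of features for which a value is recorded."""
        values = [getattr(self, f.name) for f in fields(self)
                  if f.name != "language"]
        return sum(v is not None for v in values) / len(values)


maithili = NegationRecord(
    "Maithili", particle=True, suppletive_negative_verb=False,
    tam_based_strategies=False, any_asymmetry=False,
)
nepali = NegationRecord(
    "Nepali", affix_or_clitic=True, suffix_word_final=False,
    tam_based_strategies=True, any_asymmetry=True,
)

print(maithili.coverage(), nepali.coverage())  # 0.5 0.5
```

Keeping undocumented features as `None` rather than `False` is what allows a coverage threshold of the kind described in § 3.1 to be applied.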
3.3 'Zero Negation' in Dravidian 


Various South Dravidian languages such as Toda, Kannada and Tamil 
show a special negative form which appears to be unique crosslinguis- 
tically (Miestamo 2010; Pilot-Raichoor 2011). What makes this Dra- 
vidian construction crosslinguistically unique is that it consists only 
of the stem and PNG marking, with no further marking, including no 
overt negative marking. This is in contrast to other finite verb forms, 
where a TAM marker intervenes between the stem and PNG marking. 
This is shown for Literary Tamil and Old Kannada in [tab. 3] for the Ta- 
mil verb pati- 'learn' and the Old Kannada verb nod- 'see'. Thus, the 
negative is quite literally 'zero marked'; [tab. 4] illustrates this for Modern 
Kannada for the verb mad- ‘do’.¹⁰ 


Table 3  The zero negative in comparison with affirmative finite forms in Literary 
Tamil and Old Kannada (from Pilot-Raichoor 2011, 269) 


            Literary Tamil                   Old Kannada 
            Root   Tense   Person            Root   Tense   Person 
Past        pati   -tt-    en, ay etc.       nod    -id-    em, ai etc. 
Future      pati   -pp-    en, ay etc.       nod    -uv-    em, ai etc. 
Negative    pati   -Ø-     én, dy etc.       nod    -Ø-     em, di etc. 


Table 4  The zero negative in Modern Kannada (from Zydenbos 2020, 209) 


Singular                   Plural 
1SG        mad-enu         1PL        mad-evu 
2SG        mad-i           2PL        mad-iri 
3SG.M      mad-anu         3PL.HUM    mad-aru 
3SG.F      mad-alu 
3SG.INAN   mad-adu         3PL.INAN   mad-avu 


While the origins of this construction were openly debated by specialists 
in Dravidian linguistics in the 19th century, this discussion appears 
to have more-or-less ended soon thereafter, reappearing only briefly 
in Bloch (1935) and Master (1946) before once again disappearing 
from academic discourse. It was not until Pederson (1993) and Pilot 
(1997) that the topic was once again revived, with both authors coming 
to quite different conclusions with respect to its origin. With the 
appearance of Miestamo's (2005) monograph on negation, the topic 
has now become part of the larger typological discussion, and has 
since been dealt with in at least two further studies, Miestamo (2010) 
and Pilot-Raichoor (2011). 


10 Sridhar (1990, 227-8) assumes an -e/-a negative marker in Kannada, 
appearing between the stem and PNG marking. However, as Pilot-Raichoor 
(2011, 276-7) shows, this interpretation is incorrect, as this -e/-a is 
part of the PNG marking. In fact, the PNG markers found in the zero 
negation construction in Modern Kannada are the same as those found in 
the future tense (compare e.g. the forms found in the table in Zydenbos 
2020, 65 for the future tense with those of the zero negative in 
Zydenbos 2020, 209). 

Despite its unique status among the world's languages, the zero neg- 
ative is excluded from Miestamo's study (2005, 121), as it is not stand- 
ard negation, due to its somewhat marginal status in these languages. 
For example, Zydenbos (2020, 209) writes that the forms of the zero 
negative in Modern Kannada "are absolute negations, negating the oc- 
currence of an action or process categorically, without reference to a 
specific point in time". This is thus not the negation of a present, past, 
or future (etc.) action or state, but more of a categorical statement of 
the type "I have never done such a thing, I am not doing it now, and I 
will never do it" (Zydenbos 2020, 209; emphasis in the original). 

While we include a brief discussion of zero-negation here in order 
to present as many different types of negative constructions in South 
Asia as possible, it is only found in our sample in Kannada. Zero nega- 
tion is therefore presently not documented in the database. 


4 Results 


To visualise the data, we used SplitsTree4 (version 4.15.1) (Huson, 
Bryant 2006) to construct a NeighborNet network¹¹ and, for the sake of 
comparison, an unrooted UPGMA tree.¹² These are shown in [figs 1-2]. 
These figures are not offered as proof of any clusters in the region. 
They are merely intended to help visualise the data with respect to 
negation in these languages and to serve as a starting point for further 
discussion: both algorithms show a number of clusters - in fact, almost 
exactly the same clusters in both figures - which suggests that it is 
worthwhile to take a closer look manually at the underlying similarities 
of these clusters.¹³ Therefore, in the following discussion we take a 
closer look at the clusters in the two figures and the typological 
features which motivate them. 


11 NeighborNet (Bryant, Moulton 2004) is often used in contact linguistics to 
portray the effects of language contact. In these networks, the length of branches 
corresponds directly to the degree of divergence or 'distance' between individual 
languages. Instead of trying to find an optimal tree-like format to portray 
similarities and differences between languages, NeighborNet suggests alternative 
trees to portray the possible paths which may be taken between two points when 
there are conflicting signals in the data, as is commonly the case with language 
contact, but also with language isolates or languages which otherwise lack close 
relatives, or with data scarcity. Cf. Holman et al. 2011. 


12 UPGMA (Unweighted Pair Group Method with Arithmetic Mean), attributed to 
Sokal, Michener 1958 (cf. Wikipedia, "UPGMA", 
https://en.wikipedia.org/wiki/UPGMA#cite_note-). This clustering algorithm is a 
distance-based means of portraying similarities/differences between languages 
which assumes a constant rate of change for all languages. 
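
To make the distance-based clustering step more concrete, the following is a minimal sketch of UPGMA (average-linkage clustering) over binary feature vectors. It is not the SplitsTree4 procedure used for the figures: the language names and feature values are invented toy data, and the use of a normalised Hamming distance that skips missing values is an assumption made purely for illustration.

```python
from itertools import combinations

def hamming(u, v):
    """Share of features on which two languages differ; None marks a missing value."""
    compared = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    return sum(a != b for a, b in compared) / len(compared)

def upgma(features):
    """Build a nested-tuple tree by repeatedly merging the two closest clusters.

    The distance between two clusters is the arithmetic mean of all pairwise
    distances between their members (the 'unweighted' average of UPGMA).
    """
    members = {name: [name] for name in features}  # cluster label -> member languages
    dist = {frozenset(pair): hamming(features[pair[0]], features[pair[1]])
            for pair in combinations(features, 2)}

    def average_distance(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in members[c1] for b in members[c2])
        return total / (len(members[c1]) * len(members[c2]))

    while len(members) > 1:
        c1, c2 = min(combinations(members, 2), key=lambda pair: average_distance(*pair))
        members[(c1, c2)] = members.pop(c1) + members.pop(c2)
    return next(iter(members))

# Hypothetical toy data: four invented languages, six invented binary features.
toy = {
    "LangA": [1, 1, 0, 0, 1, 1],
    "LangB": [1, 1, 0, 0, 0, 1],
    "LangC": [0, 0, 1, 1, 0, 0],
    "LangD": [0, 0, 1, 1, 1, 1],
}
tree = upgma(toy)
print(tree)
```

In this toy example LangA and LangB differ on one feature and are merged first, LangC and LangD are merged next, and the two pairs are joined last. Unlike NeighborNet, the result is a strict tree, so conflicting signals in the data are not displayed.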

The following list gives the member languages of the individual clusters 
which are found in both figures. Cluster-internal differences, such as 
the somewhat different position of Malto, Kurukh and Gujarati on the 
left-hand side in the two figures, will not be commented upon further, as 
we are only interested here in the general groupings and their features. 
The geographical distributions of these four clusters are illustrated in 
Appendix D. The respective cluster numbers are indicated in the figures. 


Figure 1. A NeighborNet representation of negation in South Asian languages (25 features in 39 languages) 


Cluster 1 - This cluster is the most conspicuous in both figures. It con- 
sists of various Dravidian languages (Gadaba, Kannada, Malayalam, 
Kurukh and Malto), although not all (e.g. Telugu, Southeastern Kolami, 
Kuvi, Kui, and Dandami Maria are not included), and three Indo-Ary- 
an languages, namely Goan Konkani, Marathi and Gujarati, all three 
of which are spoken in western India. 


13 As Borin et al. (2021, 228) so aptly formulate it: "We see the function of the 
computational tools [...] primarily as ‘filters’ helping the linguist to separate small 
amounts of wheat from large volumes of chaff, not by identifying the wheat directly, 
but by identifying those parts of the data where it is likely to hide and be found by 
manual inspection". 

Cluster 2 - This rather heterogeneous cluster consists of the languages 
Odiya, Bengali and Darai (Indo-Aryan), the South Munda languages Juang 
and Sora, and the North Munda language Korku, spoken in central India. 

Cluster 3 - To this very large cluster belong the Indo-Aryan languages 
Awadhi, Bhojpuri, Sadri, Kashmiri, Marwari, Hindi, Maithili, and Bundeli, 
spoken in western, central and eastern North India; the Munda 
languages Ho, Mundari, Santali and Kharia, spoken in Jharkhand in 
eastern central India; Gtaʔ, spoken considerably further to the south, 
along the border with Andhra Pradesh; the Iranian language Balochi, 
spoken in Pakistan; and the isolates Kusunda (central Nepal) and 
Nihali (western central India). 

Cluster 4 - In this cluster we find the Central Dravidian language 
Southeastern Kolami, the South Central Dravidian languages Telugu, 
Dandami Maria, Kuvi and Kui, and the South Munda language Bon- 
do/Remo, all spoken in southern eastern/eastern central India; the 
North(west) Dravidian language Brahui, spoken in Pakistan, and the 
Indo-Aryan language Nepali. 


Figure 2. A UPGMA representation of negation in South Asian languages (25 features in 39 languages) 


Of all 39 languages it is only the Central Dravidian language 
Northwestern Kolami which is in different clusters in the two figures: in 
Cluster 4 in [fig. 1] and Cluster 1 in [fig. 2]. For ease of presentation, it 
will be discussed together with Cluster 4 in § 5, where its commonalities 
with Cluster 1 will also be highlighted. 

With respect to the different negative markers, of the 39 languages 
in our sample 22 languages make use of affixes in negation in at 
least one category, 9 languages make use of a negative auxiliary verb 
in at least one category, and 23 of the languages in our corpus make 
use of a negative particle in at least one category. Affixes and nega- 
tive particles are thus very evenly distributed in our corpus (22 and 23 
languages, respectively), and both are more than twice as common as 
negative auxiliaries. These figures total more than the 39 languages 
in our sample as 13 languages combine different types of markers to 
some extent, e.g. Darai, which uses both the prefixal type as well as 
the negative particle type in different categories. Of these languages 
which combine different types, 11 use two different negative strate- 
gies while a further two - Marathi and Korku - use all three strategies. 

25 languages make use of only one of these three strategies (i.e. affix, 
auxiliary or particle) in negation:¹⁴ nine languages use only affixes, 
although some of these languages do make use of different affix 
types, such as Nepali, which has both prefixes and non-word-final 
suffixes; 13 languages use only negative particles; and three make 
exclusive use of negative auxiliaries. 
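
The counts given in the two preceding paragraphs can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text. The final step, counting Odiya as a 26th single-strategy language (cf. the accompanying footnote), is our own reading of how the totals reconcile and is labelled as such in the code.

```python
# Counts as stated in the text.
total_languages = 39
uses_per_marker = {"affix": 22, "auxiliary": 9, "particle": 23}
languages_by_number_of_strategies = {1: 25, 2: 11, 3: 2}
single_strategy = {"affix only": 9, "particle only": 13, "auxiliary only": 3}

marker_uses = sum(uses_per_marker.values())
languages_counted = sum(languages_by_number_of_strategies.values())
uses_implied = sum(k * n for k, n in languages_by_number_of_strategies.items())

print(marker_uses)        # total of the three per-marker counts
print(languages_counted)  # languages covered by the 25 + 11 + 2 breakdown
print(uses_implied)       # marker uses implied by that breakdown

# Assumption (not stated outright in the text): Odiya is the remaining
# language and is counted with exactly one marker type.
with_odiya_languages = languages_counted + 1
with_odiya_uses = uses_implied + 1
print(with_odiya_languages == total_languages, with_odiya_uses == marker_uses)
```

Under that assumption both totals agree: 39 languages and 54 marker uses. Without it, the breakdown covers 38 languages and 53 marker uses.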

As will be discussed below, the distribution of the languages in 
our sample with respect to these three types is not entirely random. 
The most obvious example is that of the three languages which negate 
only with negative auxiliaries, namely Malayalam and Kannada (both 
South Dravidian) and Konkani (Indo-Aryan), many of whose speakers 
are bilingual in Kannada. Also, 10 of the 13 languages which make 
exclusive use of a negative particle are spoken in a more-or-less 
contiguous area from Rajasthan (Marwari) via central North India 
(Hindi) to Bihar and Jharkhand (several Indo-Aryan and Munda 
languages), with the other three far to the north (Kashmiri), southwest 
(Nihali) or west (Balochi). Similarly, with three exceptions, namely 
Nepali, Kusunda and Brahui, the other six languages which negate 
exclusively through affixes are all found in central and eastern India. 
Clearly, genealogical tendencies and areal pressure both play a role in 
the distribution of these features. 

The significance of the data visualised in [figs 1-2], and above all 
of the features behind these clusters, is discussed in detail in 
§ 5, where we show which areal patterns are most likely due to 
language contact and suggest, where possible, what type of language 
contact in the past has led to the observed results. 


14 If we include Odiya here, for which we could not be sure that it only has one 
category, then there are 26. 

5 Analysis 


In the following we discuss each of the individual clusters with re- 
spect to the predominant negating strategies documented in the da- 
tabase and what information this provides us with respect to histori- 
cal language contact. 


5.1 Cluster 1 


What is most notable about this cluster is that it consists of the three 
western Indo-Aryan languages Konkani, Marathi and Gujarati, and 
just to the south of these the South Dravidian languages Kannada 
and Malayalam. While Gadaba, Kurukh and Malto are also Dravidi- 
an languages, Gadaba is spoken in eastern Andhra Pradesh and Ku- 
rukh and Malto are spoken much further north and east, primarily in 
western and northeastern Jharkhand, respectively. We therefore be- 
gin here with the western Indo-Aryan and South Dravidian languag- 
es in this cluster. 

The most notable aspect of this cluster in [fig. 1] is the exposed po- 
sition of Konkani, Kannada and Malayalam. The reason for this like- 
ly lies in the fact that these languages make exclusive use of nega- 
tive auxiliaries (GB298) generally deriving from a suppletive negative 
copula (SA075, SA076). Negation here thus consists of finite (SA078) 
or non-finite (SA079, SA080, SA081) forms of the lexical predicate, of 
which most are asymmetric constructions (SA079a, SA080a, SA081a, 
SA086). Also, there are different negating strategies in all three lan- 
guages for TAM categories (SA083, SA084), and again generally asym- 
metric constructions (SA083a, SA084a). This is especially true of Kon- 
kani and Kannada. 

Consider the data in [tabs 5-6], which illustrate the affirmative and 
negative categories in the indicative and the imperative in both of 
these languages. The form of the lexical predicate in negation (i.e. in- 
finitive, participle, finite form) is given in bold print directly above the 
corresponding negative verb form in both tables. 

Although there are other Indo-Aryan languages with negative copulas, 
it is much less common elsewhere in Indo-Aryan to use these as 
a major negative strategy than in Konkani, and to a much lesser 
extent in Marathi (see further below), and Konkani is one of only three 
languages to make exclusive use of negative auxiliaries in negation - 
the other two being, crucially, Kannada and Malayalam.¹⁵ Otherwise, 


15 It is also found in some dialects of Sadri (first author's own data), but not in the 
standard dialect, from which the Sadri data for this study were taken. Note that 
Miranda (2003, 760) gives a short list of examples of Kannada influence on Konkani, 
one of which is negation, although very brief and rather vague: "Non-finite forms of 
the verb are used in the various tense-aspect forms of negative sentences". 

negative particles and affixes are generally used in Indo-Aryan. It is 
therefore clear that Konkani has developed the negative patterns 
illustrated in [tab. 5] through contact with Kannada. 


Table 5  Affirmative and negative strategies in Goan Konkani 
(based on Almeida 2004, 98-9 and examples throughout that book)¹⁶ 


                    Affirmative form                      Negative form 

                                                          Simple finite verb plus negative copula 
Simple past         rig-l-5 [enter-PST-1SG.M]             rig-l-5 nà 

                                                          Stem plus negative copula 
Present             rig-tà [enter-IPFV.1SG]               rig-ná 
Past imperfective   rig-ta-l-5 [enter-IPFV-PST-1SG.M]     rig=nasl5 

                                                          Future participle (=c2) plus negative copula 
Future              rig-tel-5 [enter-FUT-1SG.M]           rig=co nà 

                                                          Infinitive 2 (-dik) plus negative copula 
Present perfect     rig-I6 [enter-PERF.1SG.M]             rig-ük nà 
Past perfect        rig-lel-5 [enter-PST.PERF-1SG.M]      rig-ük nasl5 

                                                          Infinitive 1 (-d) plus specialised form of negative copula 
Imperative          rig                                   rig-G naka 


Table 6  Affirmative and negative strategies in Standard Kannada 
(adapted from Zydenbos 2020, 149-50, 160, 179-82, 184-9) for māḍ(u) ‘do’ 


                     Affirmative form                        Negative form 

                                                             Verbal noun -uvud(u)ⁱ plus negative locative copula illa 
Present              māḍ-utt-ēne [do-PRS-1SG]                māḍ-uvud=illa 
Future               māḍ-uv-enu [do-FUT-1SG]                 māḍ-uvud=illa 

                                                             Present participle -utt plus negative locative copula 
Present continuous   māḍ-utt=iddēne [do-PRS-PRS.COP.1SG]     māḍ-utt-illa 

                                                             Infinitive in -al plus negative locative copula 
Simple past          māḍ-id-enu [do-PST-1SG]                 māḍ-al-illa 

                                                             Sequential converbⁱⁱ plus negative locative copula 
Present perfect      māḍ-i=ddēne [do-CVB-PRS.COP.1SG]        māḍ-i-lla (*-i-i > -i) 

                                                             Infinitive in -a plus bēḍa 'is not needed/wanted' 
Imperative           māḍu                                    māḍ-a=bēḍaⁱⁱⁱ 


i This form consists of the non-past tense marker -uv and the nominaliser -ad(u)/-ud(u). 
ii Referred to in Zydenbos (2020) as the “gerund”. 
iii bēḍa is written together with the preceding infinitive; however, since it can also stand alone, we consider it 
here to be enclitic. 


16 The present tense is indicated through a lack of overt tense marking following the 
imperfective marker -ta, to which nasalisation (denoting the 1st person singular) then 
directly attaches. nà and nasl5 in [tab. 5] are the forms of the 1st person singular 
(masculine) of the negative copula in the present and past tenses, respectively. 

Although we argue that Konkani has developed these complex negative 
strategies through contact with Kannada, even a brief glance at 
the data in [tabs 5-6] shows that the Konkani constructions are not sim- 
ply direct borrowings from Kannada. To begin with, all negative con- 
structions in Konkani are based on Indo-Aryan morphs, not morphs 
borrowed from Kannada. Instead, what has been borrowed here is 
the general pattern of almost entirely asymmetric negative construc- 
tions which make use of a negative auxiliary generally deriving from 
the negative copula. 'Borrowing' of this type, as opposed e.g. to that 
of simple lexical items, is only possible with speakers who are fluent 
in both languages. This speaks for a prolonged period of stable bilin- 
gualism between Konkani and Kannada, which is also grounded in Kan- 
nada's and Konkani's historical relationship (e.g. Miranda 2003, 760). 

Furthermore, despite all similarities, there is no exact fit between 
the individual categories in the two languages, which again implies that 
the respective speakers will have been fluent in both languages and 
will have been able to 'borrow' structures in such a way as to maintain 
the TAM distinctions which both languages otherwise show. In other 
words, while the overarching pattern which was copied into Konkani 
was one of predominantly asymmetric negation with a specialised 
negative auxiliary, this occurred in Konkani in a way which was 
in sync with the overall system of that language and was not just a copy 
of the Kannada structures. 

For example, Kannada shows a paradigmatic asymmetry in which 
the present and future distinction found in the affirmative is lacking 
in the negative, whereas Konkani shows no such TAM paradigmatic 
asymmetries, and both the present and the future in Konkani are ne- 
gated through constructions, neither of which is found in that form 
in Kannada. Also, while the infinitive followed by an auxiliary is the 
negative strategy for the perfect in Konkani, it negates the past tense 
in Kannada. 
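
The paradigmatic asymmetry described here can be made explicit by mapping each affirmative category to the construction that negates it and then looking for affirmative categories that share one negative construction. The sketch below paraphrases the constructions of [tabs 5-6] as descriptive labels, given as pairs of (form of the lexical verb, negator). The labels and the helper function `neutralised` are our own illustrative choices and are not codings taken from the database.

```python
from collections import defaultdict

# Negative construction per affirmative category: (form of lexical verb, negator).
# The labels paraphrase Tables 5 and 6; they are descriptive tags, not attested forms.
KANNADA = {
    "present": ("verbal noun", "illa"),
    "future": ("verbal noun", "illa"),
    "present continuous": ("present participle", "illa"),
    "simple past": ("infinitive in -al", "illa"),
    "present perfect": ("sequential converb", "illa"),
    "imperative": ("infinitive in -a", "beda"),
}
KONKANI = {
    "simple past": ("finite verb", "present negative copula"),
    "present": ("stem", "present negative copula"),
    "past imperfective": ("stem", "past negative copula"),
    "future": ("future participle", "present negative copula"),
    "present perfect": ("infinitive 2", "present negative copula"),
    "past perfect": ("infinitive 2", "past negative copula"),
    "imperative": ("infinitive 1", "specialised negative copula"),
}

def neutralised(paradigm):
    """Return groups of affirmative categories that share one negative construction."""
    groups = defaultdict(list)
    for category, construction in paradigm.items():
        groups[construction].append(category)
    return [sorted(categories) for categories in groups.values() if len(categories) > 1]

print(neutralised(KANNADA))
print(neutralised(KONKANI))
```

On this encoding, the only neutralisation found is the Kannada present/future pair, while every Konkani category keeps its own negative construction. The check also shows that the infinitive-based construction serves different categories in the two languages.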

In fact, the Konkani system is morphologically even more complex 
than the Kannada system which served as a model for its negative 
patterns, further showing how the new structures were integrated into 
the existing grammatical structures of Konkani. With respect to the 
affirmative categories, Konkani distinguishes person, number and 
gender in all persons in most TAM categories; in Kannada gender 
distinctions are restricted in the affirmative to the 3rd persons. 
Consider the data for the Konkani and Kannada affirmative future in 
[tabs 7-8].¹⁷ 


17 The alternative forms in Kannada are not related to gender distinctions but are 
free or regional variants. 

Table 7  Affirmative future in Konkani (ker- ‘do’, from Almeida 2004, 77) 


     Singular                              Plural 
     m           f           n             m           f            n 
1    ker-tel-5   ker-tel-i   ker-tel-€     ker-tel-€   ker-tel-yo   ker-tel-i 
2    ker-tel-2   ker-tel-i   ker-tel-€     ker-tel-E   ker-tel-yo   ker-tel-T 
3    ker-tel-2   ker-tel-i   ker-tel-€     ker-tol-€   ker-tel-yo   ker-tel-T 


Table 8  Affirmative future in Kannada (māḍ(u) ‘do’, from Zydenbos 2020, 66) 


Person   Gender   Singular                    Gender   Plural 
1                 māḍ-uv-enu / māḍ-uv-e                māḍ-uv-evu 
2                 māḍ-uv-e / māḍ-uv-i                  māḍ-uv-iri 
3        M        māḍ-uv-anu / māḍ-uv-a       HUM      māḍ-uv-are 
         F        māḍ-uv-alu 
         NHUM     māḍ-uv-adu / māḍ-uv-udu     NHUM     māḍ-uv-uvu / māḍ-uv-avu 


The differences with respect to morphological complexity in the 
negative are even greater. In Kannada, the entire affirmative paradigm is 
negated by the invariable form māḍ-uvud=illa (cf. [tab. 6] above), 
consisting of the non-past verbal noun māḍ-uvud(u) and the invariable 
negative copula illa. By contrast, in Konkani all PNG distinctions 
are retained for all persons in the negative (except in the present 
perfect). In the case of the future, for example, the negative consists of 
the future participle in =c2, marked for gender and number (cf. [tab. 9]). 
PNG marking is then expressed on the negative auxiliary which follows the 
participle. The forms of the negative auxiliary are given in [tab. 10]. 
Thus ker=c-9 nà ‘I (M) will not do’ etc. 


Table 9  The gender/number forms of the future participle in Konkani 


Singular                          Plural 
M         F         N             M         F          N 
ker-c-2   ker-c-i   ker=c-É       ker=c-£   ker=c-yo   ker-c-i 


Table 10 The present-tense negative auxiliary in Konkani (Almeida 2004, 98) 


Person Singular Plural 
1 nà nant 
2 na nant 
3 na nant 


Thus, Konkani loses no TAM distinctions in the negative, unlike 
Kannada with its neutralised present/future distinction in [tab. 6], and 
all affirmative and negative predicates are marked for the same PNG 
categories, with the exception of the present perfect, where gender 
distinctions are lost in the negative (cf. again the discussion of 
example (3) in § 3 above). 

Nowhere else in Indo-Aryan is the copying of a general negating 
strategy from one language family into another as pervasive as it is 
along the Konkani-Kannada border, suggesting that this area of con- 
tact has been shaped by centuries of highly stable bilingual contact, 
with the Indo-Aryan-speaking regions slowly but surely progressing 
southwards and Dravidian-speaking areas gradually receding before 
them. Even today we find large numbers of Konkani speakers in Kar- 
nataka, the state whose official language is Kannada, with syntactic 
borrowings from Kannada in the Konkani of this region (cf. e.g. Nad- 
karni 1975). Although Konkani speakers here constitute a minority in 
most areas, it is nevertheless noteworthy that most Konkani speakers 
live in Karnataka, and it is only the state of Goa where Konkani speak- 
ers predominate (Almeida 1989, 5-7). This type of situation between 
Konkani and Kannada has thus likely existed for several centuries 
or perhaps even millennia, although slowly progressing southwards. 

Thus, in our view, only a prolonged period of intense bilingual 
language contact between Konkani and Kannada can account for the 
development of this type of complex negation in Konkani (cf. Trudgill 
2011), although it is not yet possible to say whether large numbers 
of speakers of both languages learned the other language or whether 
only one of these two groups was bilingual. We know that the 
Indo-Aryan-speaking area has been steadily progressing southwards along 
the west coast since Vedic times (cf. e.g. the discussion of Maharashtrian 
place names in Southworth 2005, 288-321); however, this could be 
due to extensive bilingualism on the part of Indo-Aryan L1 speakers, of 
Dravidian L1 speakers, or of both groups. Future research is required here. 

The main differences between Marathi and the Konkani-Kannada 
pattern are that Marathi has all three different types of negative mark- 
ers, i.e. prefix, particle and also a negative auxiliary (GB107, GB298, 
GB299; SA071, SA072, SA074, SA075, SA076), whereas Konkani and 
Kannada only have negative auxiliaries. Marathi also has periphras- 
tic negative constructions (but fewer than Konkani), some of which 
are asymmetric (SA079, SA080, SA080a), as well as different nega- 
tive strategies based on TAM, which are asymmetric (SA082, SA083, 
SA083a, SA084, SA084a, SA086). The situation in Gujarati is sche- 
matically similar to that in Marathi, although with some differences. 


18 Further evidence for this type of contact scenario is cited in Peterson 2022, e.g. 
correspondences in the imperative paradigm in both Konkani and Kannada, with 
identical PNG marking for the 1st person singular and the 3rd person singular and 
plural, which is otherwise at the very least uncommon in South Asia. Here as well, the 
morphs in Konkani have not been directly borrowed from Kannada. 

On the other hand, Malto, Kurukh and Gadaba are quite unlike the 
Konkani-Kannada type with respect to periphrastic negative 
constructions. The first two share with Cluster 1 the presence of a negative 
auxiliary (in addition to a negative affix) as well as different negative 
strategies for some TAM categories. Similar comments hold for 
Gadaba, which, however, does not have a negative auxiliary. Gadaba 
also makes use of non-final and word-final suffixes in negation, like the 
languages in Clusters 2 and 4 (see below), and it borders on 
the southern edge of Cluster 2 and on the northern edge of Cluster 4. 

However, schematically Gadaba shares a number of characteristics 
with Cluster 1 languages, such as the use of different negative 
strategies for tense-aspect and mood, both of which are asymmetrical. It 
also makes use of a negated copula with the infinitive to negate the 
past tense, like Kannada [tab. 6] or the perfect of Konkani [tab. 5]; 
however, this is not a suppletive form, as it is in those two languages. Its 
status within this cluster is thus somewhat unclear. 

In contrast, the similarities of both Malto and Kurukh with the oth- 
er members of this cluster are likely coincidental. While they do share 
many features with other Cluster 1 languages such as negative auxiliary 
verbs (GB298), suppletive negative copular verbs (SA076), a construc- 
tion with a negative auxiliary and a participle in an asymmetric construc- 
tion (SA079, SA079a), and different negative markers based on tense-as- 
pect (SA083, SA083a) and mood (SA084), at least at present we have 
no reason to assume that this is due to a family bias with the South Dra- 
vidian languages Kannada and Malayalam or with the Central Dravidi- 
an language Gadaba, nor to areal pressure, as the nearest Indo-Aryan 
language, Marathi, is spoken at a considerable distance from these two. 


5.2 Cluster 2 


Cluster 2 is quite heterogeneous with respect to the geographical loca- 
tion of languages. Some, such as Odiya and Bengali, are direct neigh- 
bours and very closely related, hence the similarities between these 
two languages are to be expected. These languages both have nega- 
tive word-final suffixes, with a particle also found in Bengali (GB107, 
SA072, SA073). Both also have suppletive negative copulas (SA076) 
and different negative strategies for certain TA categories (SA082, 
SA083), which is asymmetric in Bengali (SA083a). 

Sora and Juang are quite different with respect to negative marking. 
In both languages this marker can be a prefix (GB107, SA071), but in 
Sora it can also be expressed through a suffix, both in word-final and 
non-word-final position (SA072, SA073, SA074), similar to Gadaba in 
Cluster 1 above, whose status in that group is unclear. In Sora but not in 
Juang we also find suppletive negative copulas (SA076), while in both 
we find some periphrastic constructions, including asymmetric ones, 
with slight differences between the two languages (SA082-086). Juang 
is also one of three languages in our database which show paradigmatic 
displacement (SA085), the other two being Gutob, discussed in [tab. 1] 
above, and Bengali, in the present and past perfect (examples 10-11). 

Korku and Darai do not really fit into this cluster with respect to the 
form of the negative marker. In Korku it can take the form of a suffix, 
an auxiliary or a particle, whereas in Darai it can be a prefix or a 
particle. What they share with the other languages appears to be mostly 
the presence of different negative marking strategies with respect to 
TAM categories, at least some of which are asymmetric, although this 
holds for many languages from other clusters as well. 

In sum, although the negative strategies in Sora and Juang may to 
some extent have been affected by language contact with Odiya, this 
presumed influence would appear to be quite weak at best. On the oth- 
er hand, areal influence can be entirely ruled out with respect to Korku 
and Darai on geographical grounds. We therefore do not view mem- 
bership in this cluster as due to areal influence or family bias, with the 
obvious exception of Odiya and Bengali, but most likely as coinciden- 
tal similarities among these languages. Also, as noted above for Clus- 
ter 1, Gadaba shares with most members of this cluster the fact that 
it has suffixal negative markers, although here as well similarities to 
this cluster are rather weak and are stronger with Cluster 4 (below). 


5.3 Cluster 3 


Cluster 3 for the most part consists of Indo-Aryan and Munda 
languages spoken in a more-or-less contiguous area stretching from 
Rajasthan through Uttar Pradesh to Bihar and Jharkhand. In this cluster 
we also find the South Munda language Gtaʔ, spoken along the 
border between Odisha and Andhra Pradesh, Kashmiri, and the Iranian 
language Balochi. What all languages in this cluster other than Gtaʔ 
have in common is that they possess a non-inflecting negative 
particle. In addition to this, Kashmiri also has a negative suffix. In contrast, 
Gtaʔ makes exclusive use of a negative prefix. 

This is the only cluster in our present database in which we find 
languages whose only negative-marking strategy is a negative particle 
(all except Kashmiri and Gtaʔ). Furthermore, apart from Kashmiri, 
all Indo-Aryan languages of this cluster belong to the so-called 'Hin- 
di Belt'. Two of these, Bundeli and Maithili, are also the two languag- 
es with the simplest negative strategy found in our database, with a 
negative particle and no further positive values in the database, in- 
cluding no negative copulas, no asymmetric constructions and no dif- 
ferent strategies based on TAM. 

All other languages in this cluster have different negative 
strategies based on mood (SA082, SA084). These non-indicative strategies 
are asymmetric in all Munda languages other than Ho, as well as in Balochi 
and Nihali (the latter of which also has an asymmetric negative 
construction based on tense/aspect, SA083, SA083a), but symmetric in all 
Indo-Aryan languages with a different non-indicative negative 
marker (SA084a). Finally, the Munda languages in this cluster, the 
neighbouring Indo-Aryan language Sadri, and Balochi all have suppletive 
negative copulas (SA076). 

Although Balochi has the same negative-marking strategies as the Munda 
languages, its similarity to these languages is clearly coincidental, 
as it is spoken far to the west in Pakistan. It is also genealogically too 
distant from Indo-Aryan for the similarity to be due to family bias. 
Kashmiri, although Indo-Aryan, also belongs to a different group than the 
'Hindi Belt' languages and therefore also likely represents an independent 
retention of this earlier negational strategy (see below). 

The isolates Nihali and Kusunda are probably only found in this 
cluster due to chance similarities in their negative strategies. They 
are not related genealogically to either Munda or Indo-Aryan, and ar- 
eal pressure can most likely be ruled out for both. 

Despite its relative geographical proximity to the eastern languages 
of this cluster, the South Munda language Gtaʔ is quite different from 
the other languages in this cluster in that it does not have a negative 
particle, the main defining structural characteristic of this cluster, 
although it does have a suppletive negative copula and different 
negative strategies based on mood. As this last feature in particular is 
very common, Gtaʔ's inclusion in this cluster is almost certainly 
due to chance similarities and not to areal or genealogical pressure. 


Family Bias, Areal Pressure, or a Bit of Both? 


As discussed in Peterson 2022, the eastern part of the 'Hindi Belt' re- 
gion consists of Indo-Aryan languages which display considerable sim- 
plifications in comparison with western Indo-Aryan languages. Peter- 
son argues that these simplifications resulted when large numbers 
of Indo-Aryan speakers entered eastern India, where their languag- 
es quickly became the lingua franca of the region. As argued there, 
this will have resulted in large numbers of speakers - in many regions 
perhaps a considerable majority of the speakers - being adult learn- 
ers of Indo-Aryan, which gave rise to a dramatic amount of morpho- 
logical simplification in eastern Indo-Aryan. It is interesting to note 
that the Munda languages of this contact area are also found in this 
cluster. This suggests that contact may be a factor behind the exist- 
ence of this cluster. 

Nevertheless, this is primarily a case of family bias, as the 
Indo-Aryan languages of this cluster have retained the features from older 
stages of these languages, going back to OIA, with few negative 
particles, differing with respect to mood, and no structural asymmetries. 
Thus, while this 'simple' negative pattern may be expected in a 
situation where a large percentage of speakers are adult L2 speakers, 
family bias alone will suffice to explain the 'Hindi Belt' members of 
this cluster, especially since many of these are spoken further to the 
west, where the massive simplifications noted in Peterson 
(forthcoming) for eastern Indo-Aryan did not take place. Thus, the 
simplification scenario is compatible with the predominant negational 
pattern of this cluster, but is not likely its primary motivating factor. 

What remains to be accounted for is the status of the Munda 
members of this cluster. As Jenny, Weber, Weymuth (2015, 107) note, it is 
extremely difficult to posit any negative-marking strategy for 
Austro-Asiatic, as the negative constructions in that family are so diverse. 
A bias only for the Munda group is, however, equally difficult to posit, as 
it is only in the Munda languages of this cluster that negation is marked 
exclusively by means of a particle. While Juang has a negative particle, 
it also negates through prefixes. Gtaʔ on the other hand negates only 
through prefixes while Sora negates with both types of affixes. Korku 
negates with suffixes, but it also has an auxiliary negative verb and a 
negative particle. As the Munda languages in our sample are found 
in three of the four clusters determined by both algorithms, this 
suggests that Munda languages in general cluster with their linguistic 
neighbours, regardless of genealogical relationships.¹⁹ We therefore 
assume here that negative-marking strategies in these Munda 
languages arose through contact with Indo-Aryan. 
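
The claim that the Munda languages are spread over three of the four clusters can be checked directly against the cluster memberships listed in § 4. The sketch below transcribes those lists with the family labels given in the text. Northwestern Kolami is kept apart because the two algorithms assign it to different clusters, and the variable names are our own.

```python
# Cluster memberships and family labels as listed in § 4 of the text.
CLUSTERS = {
    1: {"Gadaba": "Dravidian", "Kannada": "Dravidian", "Malayalam": "Dravidian",
        "Kurukh": "Dravidian", "Malto": "Dravidian", "Goan Konkani": "Indo-Aryan",
        "Marathi": "Indo-Aryan", "Gujarati": "Indo-Aryan"},
    2: {"Odiya": "Indo-Aryan", "Bengali": "Indo-Aryan", "Darai": "Indo-Aryan",
        "Juang": "Munda", "Sora": "Munda", "Korku": "Munda"},
    3: {"Awadhi": "Indo-Aryan", "Bhojpuri": "Indo-Aryan", "Sadri": "Indo-Aryan",
        "Kashmiri": "Indo-Aryan", "Marwari": "Indo-Aryan", "Hindi": "Indo-Aryan",
        "Maithili": "Indo-Aryan", "Bundeli": "Indo-Aryan", "Ho": "Munda",
        "Mundari": "Munda", "Santali": "Munda", "Kharia": "Munda", "Gtaʔ": "Munda",
        "Balochi": "Iranian", "Kusunda": "isolate", "Nihali": "isolate"},
    4: {"Southeastern Kolami": "Dravidian", "Telugu": "Dravidian",
        "Dandami Maria": "Dravidian", "Kuvi": "Dravidian", "Kui": "Dravidian",
        "Bondo/Remo": "Munda", "Brahui": "Dravidian", "Nepali": "Indo-Aryan"},
}
# Northwestern Kolami: Cluster 4 in fig. 1 but Cluster 1 in fig. 2.
AMBIGUOUS = {"Northwestern Kolami": "Dravidian"}

def clusters_of(family):
    """Return the set of cluster numbers containing at least one language of a family."""
    return {number for number, languages in CLUSTERS.items()
            if family in languages.values()}

total = sum(len(languages) for languages in CLUSTERS.values()) + len(AMBIGUOUS)
print(total)
print(sorted(clusters_of("Munda")))
print(sorted(clusters_of("Dravidian")))
```

The tally covers all 39 languages of the sample and places Munda languages in Clusters 2, 3 and 4. The Dravidian languages with a stable assignment fall into Clusters 1 and 4 only.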

In sum, the predominant negative-marking strategy in the 
Indo-Aryan languages of this cluster is due to family bias, while the similar 
negative-marking strategies of the Munda languages in this cluster are 
likely due to contact with the eastern 'Hindi Belt' languages. 


5.4 Cluster 4 


Cluster 4 largely consists of Central and South Central Dravidian lan- 
guages, but also the South Munda language Bondo-Remo, spoken 
along the Odisha-Andhra Pradesh border, the northwestern Dravid- 
ian language Brahui, and Nepali. In addition, in [fig. 1] Northwestern 
Kolami also belongs to this cluster, although it is in Cluster 1 in [fig. 2]. 

The members of this cluster all share various structural features 
with respect to negation. First, all mark negation through an affix 
(GB107), including Northwestern Kolami, which in all languages 
except Bondo-Remo is a non-final suffix (SA072, SA073). The data for 
Bondo-Remo were unfortunately not explicit enough, so we have no 
entries for features SA072 or SA073 for this language. Furthermore, 
all languages show some form of asymmetric negative construction, 
which in five out of eight languages in this cluster (and also in 
Northwestern Kolami) is due to asymmetric negative constructions based 
on mood distinctions. Nepali shows a further asymmetry in the past 
tense, where the PNG suffixes in negation differ to some extent from 
those in the affirmative. 


19 Cf. also Borin et al. (2021) with respect to the clustering of Munda languages in 
general. 

With the obvious exceptions of Nepali and Brahui, this cluster ap- 
pears to represent an older (South) Central Dravidian negative-mark- 
ing strategy which has survived in these languages up to the present. 
The only other language in this cluster in both [figs 1-2] is the South 
Munda language Bondo-Remo. While this language may well belong 
to this cluster due to areal pressure from the neighbouring Dravidian 
languages, this is not entirely clear, as we presently have no data for 
three of the critical features of this cluster. 

It is especially noteworthy that North Dravidian Brahui clusters in 
this group with (South) Central Dravidian languages, which are spo- 
ken at a great distance from the Brahui-speaking region, but does 
not cluster with Balochi, which virtually surrounds the Brahui-speaking
area and which most Brahui also speak.²⁰ Whether this similarity is due
to family bias or to chance definitely warrants further study.

With respect to the ambiguous status of Northwestern Kolami in 
[fig. 1] (Cluster 4) and [fig. 2] (Cluster 1), it is worth noting that this lan- 
guage is located in the border region of Marathi and the Dravidian lan- 
guages of central India and shows features common to both Clusters 
1 and 4. Like all other members of Cluster 4 for which we have the 
respective data, Northwestern Kolami has non-final negative suffixes, 
shows some form of asymmetry and has distinctive mood-based neg- 
ative marking. Like most Cluster 1 languages, however, Northwest- 
ern Kolami also negates with an inflecting word which derives from 
the copula, has suppletive negative copular forms, and again, distinc- 
tive negative strategies based on mood. Like some other languages of 
Cluster 1 it also makes use of both prefixes and suffixes in negation. 
Hence its different status in [figs 1-2].²¹

Thus, while a family bias is likely behind the membership of many 
Dravidian languages in this cluster, we also see some likely signs of 
areal pressure at the fringes of this area, with Northwestern Kolami 
oscillating between this cluster and Cluster 1. 


20 We are grateful to an anonymous reviewer for calling this to our attention. 


21 While this strongly suggests a high degree of long-term bilingualism between 
Northwestern Kolami and Marathi, the Ethnologue (https://www.ethnologue.com/language/kfb)
claims that Northwestern Kolami speakers have limited proficiency in
Marathi (cf. Eberhard, Simons, Fennig 2021). Further work is required. 


Summarising our results of the present section, each of the respective
clusters has a 'core group' of languages which share, for the most
part, a preferred type or preferred types of negative marking as well 
as the presence or lack of periphrastic negative constructions and/or 
constructional asymmetries. 

The languages of Cluster 1 most clearly illustrate that negative 
structures in one language or group of languages, here Konkani (and 
to a much lesser extent Marathi), have been motivated by structures in
a neighbouring language, in this case Kannada, which has very similar 
structures to those in Malayalam. As the Konkani structures are cer- 
tainly innovations, while both South Dravidian languages involved show 
similar structures, we assume that long-term community bilingualism 
lies behind the imitation of South Dravidian structures in Konkani with 
native Indo-Aryan morphology. This is in line with the arguments in Pe- 
terson 2022, who finds signs of long-term bilingualism for these lan- 
guages with respect to other features, especially of the nominal system. 

In Cluster 3 the so-called 'Hindi Belt' languages (and a few oth- 
er Indo-European languages) have retained an older system of nega- 
tion with a small number of negative particles, based on mood, and no 
constructional asymmetries, i.e. this is a clear example of family bias. 
However, the Munda languages of this cluster are also quite similar 
with respect to negating strategies. As Munda languages in general 
tend to cluster with their geographical neighbours and not with other 
Munda languages further afield, and as it is only the Munda languag- 
es of this group which negate exclusively with a negative particle, we 
assume that these Munda languages have developed this marking pattern
through contact with Indo-Aryan.

Signs of areal pressure are also found with languages which are 
at the fringes of their respective areas, e.g. Northwestern Kolami, 
which shares features of Clusters 1 and 4, between which it is locat- 
ed, as well as perhaps Gadaba from Cluster 1, which shares some fea- 
tures with Clusters 2 and 4. But both [figs 1-2] include in each cluster 
languages which are genealogically and geographically quite far re- 
moved from the other languages of their respective clusters, show- 
ing that the same negative strategies can arise and/or be preserved 
independently of their linguistic neighbours, despite all genealogical 
and areal pressure. 


6 Summary and Outlook 


In this study we present a first typology of negative-marking strategies 
in South Asian languages, based on a database of 25 structural features 
from 39 languages belonging to Indo-European (Indo-Aryan and Irani- 
an), Dravidian, and Munda families, as well as the two isolates Nihali 
and Kusunda. The features documented for each language are largely a
subset of those in Miestamo (2005), although we deviate here occasionally
from that study as our goals differ somewhat from Miestamo's. The
features we have documented in our database include the
form of the negative marker, the relation of the negative construction to
the corresponding affirmative construction, e.g. whether the negative 
construction is symmetric or asymmetric, as well as whether there are 
alternative negative constructions based on TAM or other categories. 

In a second step, we make use of two different algorithms to vis- 
ualise the patterns in the data in a first attempt to determine which 
languages most likely cluster together and why. Here we discuss the 
relevant features of the languages in each of the clusters suggested 
by the two algorithms as well as their genealogical and geographical 
distributions to determine whether the clustering is due to family bi- 
as, areal pressure, both, or merely due to chance similarity. 
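
The comparison that underlies any such clustering can be illustrated with a minimal sketch. It assumes that each language is coded as a list of yes/no answers (1/0, with None where no entry is available) to the features in Appendix C, and it compares languages by the share of jointly documented features on which they differ. The language names, the feature values, the function names and the choice of distance measure are all assumptions made for illustration; they are neither the coding in the database nor the two algorithms behind [figs 1-2].

```python
from itertools import combinations

# Toy codings for three invented languages over four features.
# These values are hypothetical and are NOT taken from the database.
FEATURES = ["GB107", "GB298", "GB299", "SA086"]
CODINGS = {
    "Language A": [1, 0, 0, 1],
    "Language B": [1, 0, None, 1],   # None = no entry for this feature
    "Language C": [0, 0, 1, 0],
}


def feature_distance(x, y):
    """Share of features with differing values, ignoring every feature
    for which either language has no entry (None)."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return None  # nothing to compare
    return sum(a != b for a, b in shared) / len(shared)


def distance_table(codings):
    """Pairwise distances for all pairs of languages."""
    return {
        (l1, l2): feature_distance(codings[l1], codings[l2])
        for l1, l2 in combinations(sorted(codings), 2)
    }


if __name__ == "__main__":
    for (l1, l2), dist in distance_table(CODINGS).items():
        print(f"{l1} - {l2}: {dist}")
```

Skipping features without an entry, rather than counting them as mismatches, is one possible design choice: it keeps a language with gaps in its documentation (such as Bondo-Remo for some features) from appearing more distant from its neighbours than the available data warrant.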

The data includes examples for all four of the scenarios just men- 
tioned; e.g. the Dravidian languages of Cluster 4 are likely an exam- 
ple of family bias, as this cluster has a clear regional focus in Andhra 
Pradesh, Telangana, and southern Chhattisgarh and Odisha, while 
other languages in this cluster are certainly due to chance similari- 
ties, such as Nepali. 

Cluster 1, on the other hand, provides the strongest example of con- 
tact-induced negative marking in our sample. Here, the traditional In- 
do-Aryan negative marking system, with a small number of negative 
particles based on mood distinctions and no asymmetries, has been 
entirely remodelled in Konkani along the lines of the negative-mark- 
ing strategies found in its Dravidian neighbour Kannada. This is a 
strong indication that this is due to a situation of long-term, stable bi- 
lingualism between Konkani and Kannada which has resulted in the 
copying of complex negative paradigms from Kannada into Konkani 
(cf. also Peterson 2022). 

Other clusters, such as Cluster 3, involve a combination of both 
tendencies. The languages of this cluster, most of which belong to 
the ‘Hindi Belt’, generally make exclusive use of negative particles 
to express negation, a clear case of family bias. However, this mark- 
ing pattern includes not only the 'Hindi Belt' languages but also the 
neighbouring North and South Munda languages of Jharkhand, Chhat- 
tisgarh and Odisha, suggesting that areal pressure is the motivating 
factor behind the inclusion of these latter languages in this cluster. In 
fact, as Munda languages are found in three of the four clusters iden- 
tified by both algorithms, this suggests that perhaps all Munda lan- 
guages have been heavily influenced by their neighbours with respect 
to negative marking and do not show any family bias (cf. also Jenny, 
Weber, Weymuth 2015), in line with the findings in Borin et al. (2021) 
for this family with respect to other features. 

However, not all languages fit neatly into one of these categories 
with respect to negative marking. To begin with, we find zero-negative
marking in Kannada (and elsewhere in South Dravidian), a highly
archaic - and crosslinguistically unique - form of negative marking.
We also find paradigmatic displacement to differing degrees in three 
eastern languages, most notably in Gutob (South Munda), but also in 
Juang (South Munda) and Bengali (Indo-Aryan), although at present 
this restriction to eastern India appears to be coincidental. To these we 
can also add the use of negative particles in Nihali; despite its mem- 
bership in Cluster 3, this appears to be a chance similarity, as Nihali 
is an isolate and is geographically quite distant from most other lan- 
guages of this Cluster, so that neither family bias nor areal pressure 
seems likely at present. 

There is still much to discover with respect to negation in South 
Asian languages and the present study can only be seen as a first step 
towards an exhaustive typology of these languages in this respect. 
With only 39 languages, and currently still without any languages of 
the Trans-Himalayan and Tai-Kadai families or the island languages 
such as Sinhala, Dhivehi etc., our database is still quite small. There 
are thus still likely many types of negative-marking strategies which 
we have not yet found. In addition, as more languages are included in 
the sample, we anticipate that further genealogical and areal tenden- 
cies will also become clearer. 

Nevertheless, despite its size our database has already highlight- 
ed numerous examples of both genealogical and areal tendencies, as 
well as a number of 'linguistic loners' with respect to negation. Since
both new languages and new features can easily be added to the da- 
tabase, this provides a solid base for future work on all aspects of ne- 
gation for the languages of this region. 


Acknowledgements 


The authors wish to thank Søren Wichmann for his valuable comments
on an earlier version of this article, Govind Mopkar for help with some
last-minute questions on Konkani, and two anonymous reviewers for
their many useful suggestions. It of course goes without saying that
we alone are responsible for all errors and misconceptions which the 
article may still contain. 




Bhasha e-ISSN 2785-5953 
1, 1, 2022, 17-62 


John Peterson, Lennart Chevallier 
Towards a Typology of Negation in South Asian Languages 


Abbreviations 

1,2,3 person 

ACT active 

AUX auxiliary 

COP copula 

CVB (sequential) converb 
F feminine 

FUT future 

HAB habitual 

HUM human 

IMP imperative 
IMPF imperfective 
INAN inanimate 
IND indicative 

INF infinitive 

IPFV imperfective 
IRR irrealis 

LNK linker 

M masculine 
MID middle 

N neuter 

NAR narrative 
NEG negative 

NF non-feminine 
NH non-honorific 
NMLZR nominaliser 
NHUM non-human 
NPST non-past 

OPT optative 
PERF perfect 

PNG person/number/gender 
PRS present 

PST past 

PTCP participle 

Q interrogative 
SG singular 

SUBJ subjunctive 
TAM tense/aspect/mood 


Appendix A


Languages in the sample (for the literature consulted,
see Part II of the references)


Glottocode   Language name

Indo-Aryan
awad1243     Awadhi
beng1280     Bengali
bhoj1244     Bhojpuri
bund1253     Bundeli
dara1250     Darai
goan1235     Goan Konkani
guja1252     Gujarati
hind1269     Hindi
kash1277     Kashmiri
mait1250     Maithili
mara1378     Marathi
marw1260     Marwari
nepa1254     Nepali
oriy1255     Odiya
sadr1248     Sadri

Dravidian
brah1256     Brahui
dand1238     Dandami Maria
pott1240     Gadaba
nort2699     Northwest Kolami
sout1549     Southeast Kolami
kuii1252     Kui
kuru1302     Kurukh
kuvi1243     Kuvi
nucl1305     Kannada
mala1464     Malayalam
saur1249     Malto
telu1262     Telugu

Munda
bond1245     Bondo / Remo
gata1239     Gtaʔ
hooo1248     Ho
juan1238     Juang
khar1287     Kharia
kork1243     Korku
mund1320     Mundari
sant1410     Santali
sora1254     Sora

Iranian
sout2642     Balochi

Isolates
niha1238     Nihali
kusu1250     Kusunda


Appendix B


Map of languages, mapped with the help of lingtypology 
(Moroz 2017) 


[Map: Languages in the sample]


Appendix C


Features documented in the database 


GB107    Can standard negation be marked by an affix, clitic or modification of the
         verb?
GB298    Can standard negation be marked by an inflecting word ("auxiliary verb")?
GB299    Can standard negation be marked by a non-inflecting word ("auxiliary
         particle")?
SA071    Can standard negation be marked by a prefix/proclitic?
SA072    Can standard negation be marked by a suffix/enclitic?
SA073    Can standard negation be marked by a word-final suffix/enclitic?
SA074    Can standard negation be marked by a non-final suffix?
SA075    Can standard negation be marked by an inflecting word homophonous with
         or deriving from the copula?
SA076    Can copula verbs be negated through suppletion?
SA077    Can standard negation be marked through suppletion with non-copular
         verbs?
SA078    Can standard negation be marked by an inflecting word together with a
         finite predicate?
SA078a   Is this an asymmetric negation strategy?
SA079    Can standard negation be marked by an inflecting word together with a
         participle?
SA079a   Is this an asymmetric negation strategy?
SA080    Can standard negation be marked by an inflecting word together with an
         infinitive?
SA080a   Is this an asymmetric negation strategy?
SA081    Can standard negation be marked by an inflecting word together with a
         type of verb form other than those in SA078-SA080?
SA081a   Is this an asymmetric negation strategy?
SA082    Are there different negation strategies based on any TAM categories?
SA083    Are there different negation strategies based on tense/aspect categories?
SA083a   Is this an asymmetric negation strategy?
SA084    Are there different negation strategies based on mood categories?
SA084a   Is this an asymmetric negation strategy?
SA085    Are there markers for TAM which have the same form but different values in
         standard negation than in non-negation?
SA086    Is there any asymmetric negation in this language?


Appendix D


Language clusters suggested by the two algorithms, 
mapped with the help of lingtypology (Moroz 2017) 


[Map: language clusters]


1 Introduction¹


Sarcasm, the deliberate attempt to point out, question, or ridicule at- 
titudes and beliefs by using words or gestures in ways that run coun- 
ter to their normal meaning, is probably a universal of human socie- 
ty. Like all ironic discourse (itself a subcategory of insincere speech) 
the power of sarcasm depends on the listener's being (or becoming) 
aware that the speaker does not mean what is being said. Often the 
gap between what is said and what the participants in a conversa- 
tion all know to be true is sufficient to tip off listeners. Perhaps be- 
cause this pragmatic kind of sarcasm occurs so commonly, it does 
not usually fall within the range of phenomena that a grammarian 
might feel obliged to account for. For instance, while the intention 
of (1) in the circumstances of its utterance may indeed be sarcastic, 
there may or may not be any audible indicator of sarcastic intent: 
The hearer may have to depend on the mismatch between the literal 
meaning of the message and what he knows to be reality (IM = implied /
intended meaning):²


(1)  $abas  Chu-y      C&ényis   gatijar-as
     bravo  is-2SG.DT  your.DAT  wisdom-DAT
     ‘Congratulations on your wisdom’.
     (IM: ‘You're not so smart as you think!’)


But there are other kinds of sarcasm in which the speaker must indicate
his intent through some behavioural or linguistic cue: a lexical
item (2), a special intonational contour (3), over-articulation (4),
pauses (5), inappropriate formality (6), overcareful framing (7),
hyperbole (8), understatement (9), mimicry (10), inversion (11), a wink,
a deadpan expression etc.:³


1 The first part of this paper is modelled on a paper co-authored by Hook and Kusum
Jain entitled "How to be Sarcastic in Hindi-Urdu”, drafted in India during the summer
of 1997 and published in 2002 in a felicitation volume for George Cardona. There is a
running comparison of the modes of sarcasm in Kashmiri and Hindi-Urdu from fn. 7
onward and a comparison of dedicated sarcasm constructions in Indo-Aryan and Dravidian
languages in §§ 5 and 6.


2 The transcription used for Kashmiri in this paper is the one worked out by Kenneth
Hill and Sajad Mir in Prof. Hill's course in linguistics field methods taught at the
University of Michigan in Fall, 1984. Based on a system often found in the linguistics
literature on contemporary Indo-Aryan languages, it was designed to minimise the use
of diacritics and special symbols. The letter (e) represents a mid (either front or
central) vowel while (i) represents a high (either front or central) vowel. Fronting is
determined by the presence of (y) or other palatal consonants. Palatalisation is uniformly
indicated with the letter (y) (except that {j}, {ch}, {č}, and {š} are inherently
palatalised); (ts) is a dental affricate; and (t), (th), and (d) are retroflex stops. In the
transcription of data from Hindi-Urdu the letter (e) always represents a mid front (never
a central) vowel while (i) always represents a high front (never a central) vowel. See
the list of abbreviations.


(2)  Sure he’ll return your books!
     (IM: ‘He won't return your books’)
(3)  Great! (with a prenuclear drop in pitch)
     (IM: ‘Terrible!’)
(4)  Pretty bad! (post-tonic geminate: ['pritti])⁴
     (IM: ‘Extremely bad!’)
(5)  What...a... find!
     (IM: ‘This is trash!’)
(6)  William Winkler, you clean up your room!
     (IM: ‘Billy, I'll withdraw my normal affection, if you don’t clean up your room’)
(7)  What seems to be the problem?
     (IM: ‘You’re not sick. You just think you are! Don't waste my time!’)
(8)  He’s a genius!
     (IM: ‘He’s far below average’)
(9)  I wouldn’t bet on it.⁵
     (IM: ‘Absolutely not!’)
(10) Did Wicky spwain his wittle wist?
     (IM: ‘Ricky is a childish complainer’)
(11) Don’t let pregnancy spoil your drug
     (The usual message is reversed with the intention of persuading drug addicts to
     get themselves sterilised)


Many of these phenomena are fair game for the grammarian. An ac- 
count of any linguistic cue (intonational, morphological, lexical, or 
syntactic) that distinguishes sarcastic from non-sarcastic utteranc- 
es in a regular or predictable way falls within the jurisdiction of the 
grammarian. However, in our discussion of Kashmiri we will be par- 
ticularly interested in lexical and constructional cues that are them- 
selves sites of sarcasm. 

Mechanisms of sarcasm can be classified by target into two
groups: A. Certain expressions invoke and attack beliefs of the person
to whom they are addressed (i.e. the hearer). B. Others impugn
the presumed beliefs of some participant in the situation denoted by
the utterance. When the speaker himself or herself is the target, the
sarcastic intent of the utterance is usually directed to his or her
previous beliefs and the sting is lightened to what may be characterised
as the ironic expression of regret.


3 Haiman (1990) attempts an exhaustive taxonomy of linguistic cues to sarcastic
intent. Jagannathan (1981, 338) discusses the sarcastic use among Hindi-speakers of
appellations like guru and xalifa. Cf. Taing 1984 for irony in Kashmiri literature.


4 The apical stop in sarcastic pretty shows an affective gemination that blocks the
flapped articulation normally expected for post-tonic intervocalic /t/ (thanks to Alexis
Manaster-Ramer and Bill Darden for this observation). A similar gemination with
displacement to the left of the tonic (using extra high pitch) is audible in the sarcastic
articulation of okay: ['okke].


5 Understatement is typical of ‘dry’ sarcasm. A more complex example: “On this date
in 1492 Christopher Columbus signed a contract with the Spanish Crown to sail the
ocean blue... in search of Asia. He did not find it" (National Public Radio's Morning
Edition, 17 April 1998). This instance shows several cues of sarcastic intent: 1. Partial
quoting of the children's rhyme "In 1492 Columbus sailed the ocean blue"; 2. the extra
pause between ‘blue’ and the phrase ‘in search of Asia’; 3. the understated ‘He did not
find it’; 4. a prolongation - with rising pitch - of the nucleus of ‘find’. The sting of this
sarcasm is dilute: its targets - Columbus and his royal backers - now all safely dead -
are gently twitted for their geographic illusions.


2 Hearer-Oriented Sarcasm 


Among the cues Kashmiri speakers use to signal that beliefs of the 
hearer are under attack is the use of the simple past tense (aka 
preterite)⁷ to denote actions that the hearer knows very well have
yet to occur: 


(12) temy    dyity-iy          ti   tsye     rety-th-as!
     he.ERG  gave-M.PL-2SG.DT  and  you.ERG  took.M.PL.2SG.ER-3SG.DT
     ‘He gave (them) to you and you got (them) from him!’ (‘them’ refers to money)
     (IM: ‘He will not give you (the money) and so of course you will not get it from him’)


This use of the past tense for future actions provides a cue to sar- 
castic intent that can be reinforced with hay 'indeed' or its abbrevi- 
ated affixal form -ay: 


6 Jagannathan (1981, 337) draws a further distinction between ‘sharp’ sarcasm (tikha 
vyangy) in which the addressee is supposed to recognise the speaker's intent and 'sub- 
tle' sarcasm (süksm vyangy) in which only hearers other than the addressee are sup- 
posed to realise that the addressee is a target. 


7 This sarcastic use of the past tense for future action has its Hindi-Urdu counterpart
in the auxiliary use of V čuk- ‘have already V-ed' which derives historically from the
main verb čuk- ‘be finished, used up’. As an indicator of sarcasm, auxiliary čuk-
corresponds to English cues of sarcasm like ‘sure!’ or ‘you bet!’. Its simple past (i.e.
preterite) tense as in (a) [an illustration from Dasa et al.'s 1965-75] and in (b) most
typically negates an act in the future:

(a) tum  ab   a     čuke!    (arthat  ‘tum  ab   nah?  a-oge’)
    you  now  come  already  that.is  you   now  not   come-2PL.FUT.M.PL
    ‘You’ve already come!’ (Intended meaning: ‘You will not come now.’)

(b) us.ke  pas   khà.ne  ke.liye  paise  nahi  tumhare  paise  vāpas  de    čukā!
    him    near  eating  for      money  not   your     money  back   give  has.already
    ‘He doesn't have enough for food! I’m sure he's gonna return your money!’
    IM: bhül    jào:  vo  tumhé    paise  nahî  dene    kā  (hai)!
        forget  GO    he  you.DAT  money  not   giving  of  is
    ‘Forget it! He's not about to give you your money!’
(13) /su hay / s-oy/         av!   tsi  gatsh-akh   khwas!
     /he indeed / he-indeed/ came  you  go-FUT.2SG  happy
     ‘He came indeed! You may rejoice!’ (IM: ‘He’s not about to come so don’t get
     all happy!’)

(14) b-ey      go-s                tsi  Chu.kh  khots.an
     I-indeed  went.M.SG-NM.1SG    you  are     fearing
     ‘Sure I went. You are worrying?’ (IM: ‘Don’t worry! I’m not gonna go!’)


Indeed, the particle hay or its affixal counterparts⁸ are sufficient in
themselves to mark sarcastic intent: 


(15) ts-ey       yi-kh         tang   koryen        manz
     you-indeed  come-FUT.2SG  tight  girls.DAT.PL  among
     ‘You will be bored in the company of girls!’ (IM: ‘Sure you'll be bored with all
     those girls around!’)


Another common cue of sarcastic intent is use of the invariant 
(oblique singular) form of bad- 'big' as an adverb of quantity: badi 'a 
lot; very’ (sarcastically ‘sure; you bet!’):⁹


8 Note the sandhi: su + hay => soy ‘he indeed’; tsi + hay => tsey ‘you indeed’; bi +
hay => bey ‘I indeed’.


9 The Hindi-Urdu parallel to this is the use of the adverb of quantity bar- ‘a lot’,
sometimes in its masculine singular default form bara (a) (Kusum Jain from Hook, Jain
2002, 364):

(a) vo   bar-a    d-egi          parti!
    she  big-DEF  give-FUT.F.SG  party(F.SG)        [DEF = default = M.SG]
    ‘Sure she’s gonna give a party!’

and sometimes as an absolutive concordant adverb agreeing in gender and number with
an intransitive subject (b) or a (transitive) object (c):

(b) tum  bare      ā-oge       madad  kar-ne!  jhüthe  vayde     karte  ho!
    you  big-M.PL  come-2.FUT  help   do-INF   false   promises  make   are
    ‘Sure you'll come to help us! You don’t keep your word.’ (Kusum Jain from
    Hook, Jain 2002, 364)

(c) vo  bar-i     d-ega              parti!
    he  big-F.SG  give-3SG.FUT.M.SG  party(F.SG)
    ‘Sure he's gonna give a party!’ (Kusum Jain from Hook, Jain 2002, 364)


(16) swa  Cha  alitsy;  swa  kar-yi      badi   vudyüg!
     she  is   lazy     she  do-3SG.FUT  a.lot  hurry
     ‘She is very lazy; sure she'll be quick!’


That the adverb badi in these examples functions as an adverb which 
takes the entire utterance in its scope may explain its unusual abil- 
ity to occur in root clauses together with another constituent to the 
left of the finite verbal element without forcing the other constituent 
to change its position: 


(17) ts-ey badi yi-kh kéésyi bakar! 
you-indeed a.lot come-FUT.2SG someone.DAT assistance 
‘Sure you will be of assistance to someone!’ 


In its power to condition V-3 word order sarcastic badi should perhaps 
be grouped together with other sentence-operators such as the set 
of Wh-words. Or the particle hay / -ay / -y may reset the V-2 count.¹⁰
Compare (18) with (19) and contrast them both with (20) where badi 
is not being used sarcastically: 


(18) ts-ey       badi   khwas  gatsh-akh!
     you-indeed  a.lot  happy  go-FUT.2SG
     ‘Sure you will be happy!’


(19) tsi      kat       khwas  gatsh-akh!
     you.NOM  how.much  happy  go-FUT.2SG
     ‘How happy you will be!’


(20) tsi gatsh-akh badi khwas! 
you go-FUT.2SG a.lot happy 
‘You will be very happy!’ 


10 See Hook, Koul forthcoming for examples and discussion of the role that the em- 
phatic particle -(a)y ‘indeed’ may have in conditioning V-3 word order in Kashmiri root 
clauses. 


3 Neutral Sarcasm


There is sarcasm of a second kind in which the speaker is not con- 
cerned with attacking or ridiculing the beliefs in particular of the 
hearer. Rather it is the event or situation itself that is being held up 
to some implicit standard that the speaker assumes his listeners subscribe
to (or believes they ought to subscribe to). The target is the
subject of the clause but the focus is on the action or event predicated
of the subject rather than on his or her (imputed) beliefs or attitudes.
A cue to sarcasm of this sort is the use of the exclamatory particle
-a(h) suffixed to the first constituent of the clause:


(21) Sakil-a   Cha-s!
     beauty-à  is-3SG.DT
     ‘What a beauty he / she is!’

(22) cay-a  Cevyi-kh!
     tea-à  served-3PL.ER
     ‘The tea they served!’


If the hearer happens to be the subject then he or she becomes the 
target: 


(23) hyemith-a  keri-th!
     courage-à  did-2SG.ER
     ‘What courage you showed!’

(24) poz-a(h)  Chukh    van-an!
     truth-a   are.2SG  tell-ing
     ‘Right!’


Why are rhetorical questions recruited as a mode for delivering sar- 
casm? Perhaps because they have the same form as real questions, 
they may provide the sarcast with desired cover: it is harder for the 
hearer to stop the conversation to make an explicit complaint or ac- 
cusation if the speaker's words (and intentions) are ambiguous. It is, 
after all, the desire of the speaker to deliver a psychological blow 
without assuming all the risks of making an explicitly hostile remark 
that is the fundamental motive for using sarcasm. 


4 Hearer-Oriented versus Subject-Oriented Sarcasm 


The cues we have examined so far are used to indicate to the hear- 
er that the speaker does not subscribe to the hearer's views (25). In 
(26a-b) we may observe another kind of sarcasm, one which targets 
not the hearer of the utterance but rather its 'subject' (usually an 
agent-subject or experiencer-subject). Sarcasm of this subject-orient- 
ed kind shows up when the speaker assumes that the hearer agrees. 
Contrast (25) with (26a-b): 


(25) su  hàv-yi        badi   panun   bajar
     he  show-FUT.3SG  a.lot  self's  greatness
     ‘Sure he'll show his greatness!’ [IM: ‘He's not so great (as you think he is)!’]


(26a) badi   üv         bajar      hav-an.vol
      a.lot  came.M.SG  greatness  show-er.M.SG
      ‘Here comes the big cheese!’ [IM: ‘He is not so great (as he thinks he is)!’]


(26b) badi   ayi        bajar      hàv-an.vàjinyi
      a.lot  came.F.PL  greatness  show-ers.F.PL
      ‘Here come the big cheeses!’
      [IM: ‘They are not so great (as they think they are)!’]


In (25) the speaker is using sarcasm to undermine or ridicule the 
hearer’s expressed or implied position that someone has great potential,
whereas in (26a) and (26b) the speaker is not attacking the
views of the hearer but rather ridiculing the self-indulgent behaviour 
or the pretensions of the subject. 

If the hearer happens to be the subject then it follows that the sar- 
cast is attempting to degrade her or him: 


(27) tswapi   kar!  badi   ay-akh            tatyi  pyethi  sad    ben-yith
     silence  make  a.lot  came.F.SG-2SG.NM  there  from    saint  become-GER
     ‘Shut up! You've come from there so pure and holy!’ [IM: ‘You’re not so holy as
     you think!’]


5 Is there a Construction Dedicated to Subject-Oriented Sarcasm?


The construction in (26a-b) and (27), widespread in languages spoken
on the northern and western sides of India, displays a specific
pattern, definable as in (28). One may speak of a 'dedicated' con- 
struction: 


(28) (subj) + aggrandising element (or Wh-) + finite form of {come} + mocked action,
     state, attitude¹¹


11 On hearing some of these examples Colin Masica (pers. comm.) objected that a 
sarcastic construction "Here comes / a // the / big X" exists in English (and presuma- 
bly in all languages). Of course, in the right situation and with the right intonation any 
utterance can be interpreted as sarcasm. Construction (28) as illustrated in (26)-(27) 
and (29)-(40) differs from speech overlain with sarcastic intonation in being special- 
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(29) khuba a-ek-i Che  sevà gar-na! [Nepali] 
a.lot |came-PP-FsG is.F.SG service do-INF (Netra Paudiyal, pers. comm.) 
‘She’s come to help!’ [IM: 'She's too late to help.’ or ‘She’s not one to help others.’] 


(30) bar-i à-i Cavi dend-àr-i... [Garhwali] 
big-F.SG came-F.SG key  giving-NOM.AG-F.SG
‘So here she is, the big key-giver!’ (Ghildiyal 1981, 49) 


(31) vo  bar-i à-i paropakar  kar.ne-vàl-i! [Hindi-Urdu]
she big-F.SG came-F.SG help do-er-F.SG
‘Here she comes, the Good Samaritan!’ (Hook, Jain 2002, 365) 


(32) badd-î ay-1 üc-r batà karna-Gl-7 [Bagri/Haryanavi] 
big-F.SG came.F.SG high-F.PL things do-er-F.SG


‘Here she is, the big talker (IM: the pompous ass!)' (Lakhan Gusain, personal 
communication) 


(33) vad-i à-i pari-likhi... pari-likhi hai to [Panjabi]
big-F.SG  came.F.SG read-F.SG-written-F.SG read-F.SG-written-F.SG are then
is-ka mining batà
this-GEN meaning tell


‘Here she is, the highly educated one. So, if you're so educated, tell me the
meaning of this!’ (sharechat.com/video/MxQ8q4W?referrer=url)

(34) vad-o ayo mohabbat kar-na-var-o [Sindhi]
big-M.SG came.M.SG love do-INF-ER-M.SG
‘Here he is, the great lover!’
(sindhiadabiboard.org/Catalogue/mehran/Book25/Book pageT. 
html) 

(35) ja ja pàagal gayo Che, mot-o āvy-o papi mag.va-val-o [Gujarati]
go go crazy gone are big-M.SG came-M.SG kiss ask-er-M.SG
koi juve to sù kah.e
someone see then what say


‘Get away. You crazy? The big kiss-demander! What would someone say if
they saw?’ (gadyasarjan.wordpress.com/2012/12/26)


ised for [or dedicated to] the delivery of sarcastic intent. That is, for speakers to ut- 
ter any of these examples without sarcastic intent is unlikely maybe even impossible. 




Bhasha e-ISSN 2785-5953 
1, 1, 2022, 63-76 


Peter Edwin Hook, Omar N. Koul 
A Dedicated Sarcasm Construction in Kashmiri as a Feature of the South Asian Linguistic Area 


(36) ha kon ala tikojirav amhaà-la sang-nar-à [Puneri Marathi] 
he who came-M.SG nosy.parker us-DAT  tell-PRESPRT-M.SG


‘Who is this kibitzer-shmitzer to tell us (what to do)?’ (https://www.maayboli. 
com/node/26768) 


(37) lay al-à sallà de.nàr-à [Wardha / Nagpuri Marathi]
much came-M.SG advice giver-M.SG


‘Here comes the great advice giver!' (P. Mashram and R. Mhaiskar, via 
Sonal Kulkarni-Joshi) 


(38) vhell-€ ayl-G sell» div-pa-k [Goan Konkani] 
great-F.SG  came-F.SG advice give-INF-DAT


‘She’s a great one to give us advice!’ (N.F. Gaonkar and G. Mopkar, via Sonal
Kulkarni-Joshi) 


(39) pedda vatts-àdu badhyata gala paurud.i-laga [Telugu]
big come-PST.M.SG responsibility with citizen-like
‘...as if he were a responsible citizen!’


* Validity of Telugu data, glossing, and analysis confirmed by K.V. Subbarao, 
Peri Bhaskararao, and Shalinee Gusain. 


(40) aval periyya vant(-utt)¹²-a(l) enak-ku camaiyal  collit-tar-a
she bigly came-LET-F.SG me-DAT cooking teach-GIVE-INF
tanak-k.6 tocai küta vàrk-ka  teriyatu 
self-DAT dosa even pour-INF know-NEG 


‘She’s a good one to teach me how to cook. She can't even manage to make a 
dosa herself!’ (This Tamil example is from Kanaka Jagannathan, via Bharati 
Jagannathan, personal communication)¹³


12 The morpheme -utt- -LET- is the colloquial abbreviation of vitt-, the past tense form
of the vector (vi)tu [LET GO, RELEASE]. See Annamalai 2021, 308 ff. for a detailed
description of (vi)tu and other Tamil vectors.


13 The Tamil example in (40) patterns identically to most of the Indic examples in
(26a-b)-(38). However, not every Tamil speaker accepts periyya in (40), some preferring
instead the adverb perusa (Umarani Pappuswami, pers. comm.):


(a) per-usa vant(-utt)-a(l) enak-ku camaiyal collit-tar-a! tanak-ke tocai küta
vàrk-ka teriyatu




6 Is the Sarcasm Construction Displayed in (28) 
a Feature of the South Asian Linguistic Area? 


In all of the South Asian languages surveyed, the expression of sarcasm
may use a concordant form of a mocking aggrandising adjective,
bar- ‘big’ (Hindi, Garhwali) / badd- (Bagri, Haryanavi) / vad- (Panjabi,
Sindhi) / mot(h)- (Gujarati, Marathi) / vhell- (Konkani); an invariant
adjective, bhari ‘heavy’ (Bengali) / maha ‘great’ (Kannada) / pedda
‘big’ (Telugu); an adverb, perusa (Tamil) / badi (Kashmiri) / khuba (Nepali)
‘bigly, a lot’ / lay (Nagpuri) ‘very’; or an interrogative pronoun,
kon (Marathi) ‘who’. However, the full pattern displayed in (28)
is not found in Bengali (Probal Dasgupta, pers. comm.), while Kannada
has what seems like an inverted form¹⁴ of it (S.N. Sridhar, pers.
comm.). The distribution of the templatic pattern shown in (28) across
most of Indo-Aryan in a contiguous block, taken together with its presence
in Dravidian Telugu and Tamil, accords with its being regarded
as an incomplete or fragmentary feature of South Asia as a linguistic
area (Masica 1976),¹⁵ perhaps one yet to reach its full extent.

It remains to be seen if specialised constructions or explicit markers
of sarcastic intent parallel to Kashmiri's badi yi- V-an-vol- (26a)
are found in languages spoken outside South Asia and, if so, whether
they are commonly used in the languages that have them.¹⁶


14 Notice that in (a) with respect to the template in (28) finiteness and non-finiteness 
of forms have switched places: 


(a) avalu ba-nd.u (maha)  nan-ge heli-kod-tà-le [Kannada] 
she come-PPART great me-DAT.EMPH tell-GIVE-NON.PST.F.3SG
‘She presumes to teach me - (the big know-it-all)’. (S.N. Sridhar, personal communication) 


This difference brings the Kannada construction closer to normal South Asian SOV 
word order. Thus, rather than constructionally, the cue to sarcastic intent must be in- 
dicated intonationally and/or by the presence of maha. 


15 See Emeneau’s definition of a linguistic area: “an area which includes languages 
belonging to more than one family but showing traits in common which are found not 
to belong to the other members of at least one of the families” (1956, 16 fn 28). 


16 Does the construction in (28) share features with other kinds of insincere speech? 
One reason for thinking it may is that in all these languages the finite form of {come} 
is not in its normal clause-final position. In all of them some material follows the verb: 
an infinitive, a participle, or a noun phrase expressing the actions or attitudes on the 
reality or legitimacy of which the sarcast casts doubt. Comparable to this displace- 
ment from the canonical clause-final position of the finite verb is the ‘move-left’ phe- 
nomenon observed in Hindi-Urdu expressions of irony, especially those involving in- 
ceptives (Hook 2011): 


(a) lage aurò ki.tarah tum bhi càplüsi kar-ne 
begun.2PL.M others like you too flattery do-INF 
‘There you go, just like the others, trying to flatter me’. (Premchand [1936] 1960, 51) 
Abbreviations


1          first person
2          second person
3          third person
           ablative
           accusative pronominal suffix
AG         agentive
DAT        dative
           default or invariant form
           dative pronominal suffix
EMPH       emphatic particle
           ergative pronominal suffix
           ergative
F          feminine
           future
GEN        genitive
GER        gerund
IM         implied / intended meaning
INF        infinitive
M          masculine
NEG        negative
NM         nominative pronominal suffix
NOM        nominative
NON.PST    non-past
PL         plural
PPART      past participle
PRESPRT    present participle
PST        past tense
SG         singular


Bibliography
1 Introduction
bhasa samskrtapabhramsah bhasapabhramsastu vibhasa | 
satattaddesa eva gahvaravasinamca prakrtavasinamca || 
Abhinavabharati 17.49 


The corruption of Sanskrit is bhasa, and the corruption of bhasa is vibhasa. 
It is the language of the same countries of forest dwellers and of rustic people.
(Kavi 1934, 376) 


This paper is part of the Imagining Sanskritland project, which focuses
on locating and documenting how, where, and why the Old
Indo-Aryan language, Sanskrit, is spoken in the twenty-first century.
This builds on the previous work of Nakamura (1973), Hock and
Pandaripande (1976), Hock (1991), Aralikatti (1989; 1991), Aklujkar 
(1996), Hastings (2004; 2008), and Deshpande (2011; unpublished). 
Generally, this project expands beyond linguistic analysis of sen- 
tence structure to document the aspirations, ideologies, and moral 
horizons inherent in identifying as a speaker, as well as document- 
ing second-language acquisition through a focus on imperfect learn- 
ing, substrate interference, and bilingualism. 

The project first focused on code-switching between Hindi and 
Sanskrit and the transubstantiation of symbolic capital in a resi- 
dential Sanskrit college/yoga ashram in Gujarat, India (McCartney 
2011; 2014a; 2017a; 2018a). Another focus was the two-week intensive 
speaking course in New Delhi (McCartney 2014b). The focus pivoted 
to include a relatively famous 'Sanskrit-speaking' village in Madhya 
Pradesh, India (McCartney 2015; 2016a; 2016b; 2016c; 2017b; 2017c; 
2017d; 2018b), which includes discussion of language revival and hy- 
bridisation (McCartney, Zuckermann 2019). This combines with ana- 
lysing the theo-politics of Sanskrit's imaginative consumption within 
the transglobal wellness industry and the topic of 'yoga fundamental- 
ism', which maps out the distanced and banal ways that consumption 
of yoga lifestyles can potentiate tacit and unwitting support of Hindu 
supremacism (McCartney 2017e; 2017f; 2017g; 2019a; 2019b; 2020). 
More recently, the project has pivoted to cover matters related to 
Sanskrit's soft power potential within the context of faith-based, sus- 
tainable development and competitive diplomacy. This involves envi- 
ronmental impact assessments of yoga lifestyle brands and the culti- 
vation of nostalgic moods predicated by Neo-Romanticism, mystical 
holism, and dark green religion (McCartney 2021a; 2021b; 2021c). 

Informed by Deumert's (2009) and Posel and Zeller's (2015) de- 
mographic analyses of census data related to language shift and bi- 
lingualism in South Africa, this project pivots to look for Sanskrit 


I would like to thank Andrey Klebanov for helping with particular issues in develop- 
ing this paper. 
(‘speakers’) in the Indian census data. An initial result, upon which
this present paper directly builds, is McCartney (unpublished). The
policy of India's census enumeration states that if the number of
speakers of any language drops below 10,000 then it will no longer
be reported as a separate language (Goswami 2012).¹ If Sanskrit
were to dip below the threshold, this would have unintended consequences
for its soft power deployment. Therefore, the politics of
census enumeration for the purposes of state building are relevant.
Simply looking for Sanskrit speakers is something of a fool's errand.
Sanskrit is a post-vernacular language in a perpetual state of acquisition.
The media only reports the perceived success of Sanskrit's revival
(Indian Yug 2020).

In short, India's 2011 census results clearly demonstrate that Sanskrit
'speakers' are overwhelmingly found in urban areas spread
across the ‘Hindi heartland’,² which is only a part of India's complex
linguistic ecology and "linguistic area" (Emeneau 1956). What this
paper does is use the Indian government's census data to locate
geographically where people who identify as speakers of Sanskrit were
at the time of census enumeration. For now, this is as good as it gets,
as the data presented below are not capable of verifying the fluency
of people who claim to speak Sanskrit. Discussion of several issues
relating to this is found across the project's publications mentioned
above. One outcome of this present paper is that future
research can be more strategic.

Even though some consider Sanskrit to be the "language of the rural
masses" (Deopujari 2009) and the "language of future India" (Mohan
2020), it is also thought that "Sanskrit is the forgotten language
of urban India" (Indian Eagle 2020) and that "NASA believes San- 


1 This is one reason why the RSS (Rashtriya Svayamsevak Sangh, ‘National Volun- 
teer Corps’) “wants citizens to voluntarily register Sanskrit as their second language 
in the census. The RSS feels that if people register the language, the final census da- 
ta would reflect higher literacy of Sanskrit, which will force the government to take 
measures to preserve the language" (Tare 2010). 


2 Compare the notion of the "Vedic God" by Hebden (2011). The geographic focus of
this paper is the 'Hindi Heartland' or 'Hindi Belt', which covers most of the plains of
north India, where Sanskrit's close relatives - Hindi and its related languages - are spoken.
This is pertinent because, as is discussed below, this is the linguistic area where
the census results indicate most of the 'Sanskrit speakers' live. As is evident below,
the link between people who identify as Sanskrit speakers and both Hindi and English
is immense. By way of example, the state of Bihar's Sanskrit areas are discussed below.
This paper expands upon Jha's (2017) explanation of how studies of language politics
in north India tend to focus on the Hindi-Urdu debate. This debate builds on a
centuries-old development of language order in premodern India (Ollet 2017), taking on
communalist narratives. This culminated in the nineteenth century in a dispute over which of
these mutually intelligible languages - Hindi and Urdu, which derive from Hindustani
but use different scripts, respectively Devanagari (देवनागरी) and Nastaʿliq (نستعلیق) - should
become the national language. Currently, India does not have a national language.
Instead, it has two official languages, Hindi and English (Dasgupta 1995).
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skrit is a scientific language for programming" (TNN 2019). More- 
over, Sanskrit is thought to be a "gift of India for [the] entire hu- 
manity" (India Education Diary Bureau 2020) and that the "Sanskrit 
effect" is caused by "chanting Sanskrit", which "increases brain cog- 
nitive areas" (Sanskriti 2018). The benefits of chanting are predicted 
to extend beyond humans, as "Cows will talk in Tamil and Sanskrit" 
(Patherpanchali 2018). Back down on the ground, it is difficult to locate
Sanskrit speakers because the available information consists of unreliable
factoids mentioned on the internet,³ copied and pasted from


3 Yet, these lofty ambitions have humble origins amongst the mythical villages, whose
inhabitants are meant to be grateful that Sanskrit's perceived civilising power will finally
reach them, even if this ideological benevolence is soaked in a neo-colonial Sanskritisation
impetus that is made explicit in ways, such as "Samskrtam sarvesam krte...
sarvada; Sanskrit for everyone... forever" (Amaravani 2020). Strength, it seems, is not 
found in linguistic diversity. Amara-vàni (immortal-language) is ultimately a part of 
Samskrita Bharati, which itself is the linguistic node of the more prominent Hindu na- 
tionalist parent organisation, the Rashtriya Svayamsevak Sangh (RSS). It is through 
its international branch, the Hindu Svayamsevak Sangh (HSS), that Samskrita Bharati 
operates at an international level. It mostly services the Indian diaspora through cul- 
tural and linguistic events, which can be an opportunity for the collapsing of the 'big' 
versus 'little' tradition binary (Vertovec 1994), such as a potential muddling of San- 
skrit as it goes through its interlanguage stage of acquisition. However, the expan- 
sion beyond the imagined borders, especially through extending into cyberspace, re- 
quires a recalibration of relations, especially to the punya bhümi (‘sacred land’) of the 
Hindutvavadins imagination, within the context of transnational development and mul- 
tiple modernities (Jaffrelot 2017). After all, "Sanskrit is a gift of India for entire hu- 
manity". At least, that is what India's HRD Minister, Ramesh Pokhriyal, asserted just 
after the Central Sanskrit Universities Bill, 2020 was passed by India's upper house 
of parliament to upgrade three Deemed Sanskrit Universities to Central University 
status (India Education Diary Bureau 2020). Amaravani promotes Sanskrit through 
songs. One example is the song Visva-bhasa Samskrtam (‘The Universal-Language is 
Sanskrit’). Information on the song's page, on Amaravani's website, also claims that
"There are many villages in India where the entire population speaks solely and fluent- 
ly in Samskrtam!" (Amaravani 2020). Such truth claims are a curious thing. I am re- 
minded of one verse from a seventh century Ayurvedic text, which discusses poor vi- 
sion resulting from false perception: "Dürantikastham rüparica viparyasena manyate | 
dose mandalasamsthane mandalaniva pasyati || AS.Utt.15.4 ||" “Due to false perception 
(viparyasa), a patient perceives a thing located far away, as close by, and things located 
close by, as far away" (Asthangasangraha, Uttarasthanam, 15.4 [Vagbhata 2020]). In a
topsy-turvy way, viparyasa refers to the act of imagining something to be real and true,
when it is false. The term niscaya can mean both correct perception and enquiry. We 
find in the Vaisnava tradition encouragement to cultivate niscaya (Srimad Bhágavatam 
3.26.30), which is recommended for both soteriological aspirations and mundane mat- 
ters. This concept is similar to that of Nagarjuna (c. 150 CE-c. 250 CE), who discussed, 
in the Mahaprajñaparamitasastra, the concept of pratityasamutpada. This refers to the
"basic principle of thought that no two contradictory judgements can hold good in regard
to the same thing in the same respect" (Ramanan 1987, 167). In other words, 'Sanskrit-speaking'
villages either are true and do exist or they do not. There is very little
available evidence, be it documentary, direct, real, circumstantial or testimonial.
Furthermore, what percentage of a village's population and to what degree of fluency 
and ordinal ranking of usage, considering frequency of code-switching, domains and 
topics, might be the minimum requirements to satisfy claims of any village being one 
that is 'Sanskrit-speaking'? 
other websites lacking accurate details. Nonetheless, these factoids
are enough for people to believe that 'Sanskrit-speaking' villages
exist.⁴ People rely on this meme to provide emotional reinforcement
for deeply held religious beliefs, hoping it will re-enchant the world
and provide profound meaning in their lives. ‘Sanskrit-speaking’ villages
endow the world with a perceived sense of divinity, meaning, and
significance. As an empty signifier, the village is remote enough to
remain perceived as an infallible closed circuit. Similarly, Olshansky,
Peaslee and Landrum (2020) provide insight into the cognitive
defence mechanisms of flat earthers, which include motivated reasoning
to dilute cognitive dissonance and maintain cognitive consistency.
Having visited several of these so-called 'Sanskrit-speaking'
villages, I became increasingly frustrated, as most of these villages
contain hardly anyone who can hold a casual conversation across general
domains and topics, utter increasingly complex sentence patterns,
or make more complicated use of tense, aspect, or mood.
Nonetheless, throughout the last decade, my primary question has
been: "Where are the Sanskrit speakers?".⁵

Jhiri is a village in Madhya Pradesh to which I have paid closer attention
(McCartney 2015; 2016a; 2016b; 2016c; 2017b; 2017c; 2017d;
2018b) and where apparently everyone speaks Sanskrit (Samskrit101
2009). Oblivious to issues of linguistic human rights (Skutnabb-Kangas
2012), Ghosh (2008) celebrates how the residents "hardly speak
the local dialect, Malvi, any longer. Ten years have been enough for
the sanskritization of life here". A potent claim that "even those who
don't know the technicalities of the language still speak fluent Sanskrit"
(Hindutvainus 2011) is demonstrably false. For instance, while I was
conducting fieldwork in Jhiri, one of the residents repeatedly claimed
that everyone in the village spoke Sanskrit fluently. Beaming with pride,
he often emphatically asserted, “Bhoh, asmin grame iva sarve janah
samskrtam vaditum sakyante khalu [Sir, in this village everyone can
speak Sanskrit!]". However, this was easily disproven by speaking to


4 My term, laukika-samskrta-bhasa-aviparyasa-abhasa-samanvesanam, refers to the attempt
to look everywhere (samanvesana) for the semblance (abhasa) and distorted perception
(aviparyasa) of vernacular (laukika) variants, or regional dialects, of Sanskrit
(samskrta), as a spoken language. This includes its mention in yoga-related outlets (Di- 
zon 2016; Bedewi 2020). By first identifying the 'pseudo-perception' (pratyaksa-abhasa) 
found in discourse (vyavahara), the process moves to falsifying (dusayati) fallacies 
(hetvabhasa) related to the enduring claims about 'Sanskrit-speaking' villages, which 
are legion. 


5 In logic, abhasa takes the meaning of 'erroneous though plausible argument’. These
rumours act as a buffer warding off existential anxiety. At least it is potentially com- 
forting to know that a 'real' and 'true' India still exists and that "India's own Jurassic 
Park" is found in rural Madhya Pradesh, in a village that "is a lost world that has been 
recreated carefully and painstakingly, but lives a precarious existence, cut off from the 
compelling realities of the world outside" (Ghosh 2008). The village holds an ambigu- 
ously utopian relation to future India (Nandy 2001; Mohan 2012). 
anyone within arm's reach, who would require prompts from others
as they could not reply to simple sentences, such as "Bhavatah nàma 
kim [What is your name?]" or "Mitram, kutra gacchasi [Friend, where 
do you go?]", let alone pass sentence repetition tests. Yet, these ru- 
mours become factoids and bloom into unassailable facts in support 
of "core Indian values" (Prabhu 2014). One potent example is the use 
of spoken Sanskrit to sell a motorbike, which uses a 'Sanskrit-speak- 
ing' village as the backdrop (Sharma 2009). 

One issue with validating the returned Sanskrit tokens in the census
data is that they are self-reported. Another issue is that census
enumerators are bound by law not to question the maximum of three
languages given. Therefore, if someone believes they speak Sanskrit
(or any other language) and identifies as a speaker of Sanskrit, then
it becomes a demographic fact. Simply put, one token refers to one
individual instance of someone claiming to be either an L1 ('mother
tongue’), L2 (second-language), or L3 (third-language) speaker.⁷
These epistemic and methodological issues have been discussed at every
census (Office of the Registrar General 2020).

All of the data are publicly available in downloadable spreadsheets
(Office of the Registrar General 2020). The 2011 data come from the
relevant C-16 (‘mother tongue’)⁸ and C-17 (bi-/tri-lingualism) tables,
which only became available in late 2018. Multiple versions of these


6 India, like many nations, has used different and equally ambiguous terminology to
capture the primary language(s) used by its citizens. There are actually three language
'situations' that can be captured by censuses: (a) the language first used by the
respondent; (b) the language most commonly used by the respondent at the time of the
census; and (c) the knowledge of particular official language(s) by the respondent (Arel
2004). The Indian census does not appear to achieve this tripartite system, simply because
it does not ask the necessary questions. Is this due to pragmatic expediency,
ideology, or both? Building upon Foucault 2007, Duchéne, Humbert, Coray 2018 explain
the consequences of reducing real-world complexities through statistics to quantifiable
categories. These can become tools of governance for national solutions. This,
after all, is probably the point. Yet, if there is an ideological component, it then becomes
more challenging, particularly when added to the epistemic relativism of self-reporting,
to gather meaningful data to implement productive policy.


7 This present study does not have the capacity to verify whether these tokens trans- 
late into real-world pragmatic abilities to communicate in Sanskrit. 


8 A ‘mother tongue’ is defined in the Indian Census guidelines (3.1) as "the language
spoken in childhood by the person's mother to the person. If the mother died in
infancy, the language mainly spoken in the person's home in childhood will be the moth- 
er tongue. In the case of infants and deaf mutes, the language usually spoken by the 
mother should be recorded. In case of doubt, the language mainly spoken in the house- 
hold may be recorded" (Office of the Registrar General 2020). Section 3.2.a-d stipu- 
lates that an 'enumerator' is "Bound to record the language as returned by the person 
as her/his mother tongue"; and that they must record the "mother tongue in full, what- 
ever is the name of the language returned by the respondent and do not use abbre- 
viations". Collectors are not expected to determine if a "language returned by a per- 
son is a dialect of another language"; and, if there is any "relationship between moth- 
er tongue and religion" (Office of the Registrar General 2020). This gets complicated 
tables exist, as each state has a separate C-16/-17 table. After the millions
of data points had been collected and sorted, each table was filtered using
Excel's data-analysis functions. The resulting aggregate token
amounts were then cross-analysed over several levels of administration.
These levels include the national, state, district, and sub-district levels,
as well as rural and urban zones and sex ratios.

A key finding demonstrates high levels of affinity between the triune
of Sanskrit (S), Hindi (H), and English (E), which is also nuanced
by each state's official language (SOL). This means that people who
identify as L1-L3 speakers of Sanskrit are overwhelmingly clustered
within a specific set of L1-L3 languages, which more often than not
includes Hindi and English.⁹ Therefore, the statistically relevant identity
of the typical ‘Sanskrit speaker’ is an educated, middle-class urbanite
who lives in north India.¹⁰

Based on 2011's C-16 table, the total nation-wide number of L1-San- 
skrit Persons amounts to 24,821 (tokens); [fig. 1] shows the totals for 
all states and union territories. 


when Hindu nationalist groups urge people to list Sanskrit as their L1 in the lead up 
to each census (Tare 2010). 


9 This has much to do with India's language planning policy, which prefers a Sanskritised
Hindi as the official language (see Articles 343 and 351 of India's constitution).
For example, L1-S L2-H L3-E would mean an individual's languages are L1-Sanskrit,
L2-Hindi, and L3-English. The following formula, S = (L1α L2β L3γ)(SOL), is an attempt
to articulate the ways in which Sanskrit can be found within complex linguistic ecologies
and the influence of the state language in modifying the triadic SHE. For instance,
across the language area of Maharashtra, the state language, Marathi, competes, as
it were, for space, like different taste buds, on the tongues of the state's citizens.


10 In the post-Independence era, studies related to contemporaneous spoken Sanskrit
were initiated first by the Sanskrit Commission (Azad 1957), which laid down several
recommendations for preserving and promoting Sanskrit as a spoken language,
some of which have been successfully introduced. Deshpande (2011) explains how, due 
to the changes in the three-language educational policy, Sanskrit has fared better in 
the Hindi speaking states than in the non-Hindi speaking states, where a dramatic re- 
duction in students studying Sanskrit occurred once it became optional in 1968 (see 
Azad 1957, 99). This does not stop activists from trying to install Sanskrit as the nation- 
al language (rastra-bhasa) and global lingua franca (visva-bhasa) (Ramaswamy 1999). 
As well, these language politics have long encoded Hindi as a hegemonic language yet 
raise the status of Sanskrit. Complementing this, Babu (2017, 113) explains that in In- 
dia Sanskrit sits atop the linguistic hierarchy and caste system by invoking the notion 
of catur-varna ('four-classes') by positing Sanskrit as occupying a privileged position 
and English (which is a rank outsider in the constitutional scheme), as a language with 
emancipatory capacity due to its positioning outside the legitimised hierarchy. 
Figure 1 2011 India: States L1-Sanskrit. (India 2020)


A deeper historical review of Sanskrit's enumeration across all 15 
censuses occurs in [tab. 1], which provides a cursory glance at San- 
skrit’s India-wide total Persons’ results from 1881 to 2011. 


Table 1 Total Persons L1-Sanskrit returns, 1881-2011¹¹


Year   Total Sanskrit   Total population   %
1881   1,308            224,000,000        0.0006
1891   308              287,000,000        0.0001
1901   716              238,396,327        0.0003
1911   360              252,093,390        0.0001
1921   356              251,321,213        0.0001
1931   1,181            278,977,238        0.0004
1941   N/A              318,660,580        N/A
1951   555              361,088,090        0.0001
1961   2,554            439,234,771        0.0006
1971   2,212            548,159,652        0.0004
1981   6,106            683,329,097        0.0009
1991   49,736           846,427,039        0.006
2001   14,135           1,028,737,436      0.001
2011   24,821           1,210,854,977      0.002


11 The Sanskrit data for each census are located in Plowden 1883, 132; Baines 1893,
144; Risley, Gait 1903, 164, 174; Gait 1913, 106; Marten 1923, 96; Hutton 1933, 492;
Yeats 1943, 9; Mitra 1994; Office of the Registrar General 1954, 7; Mallikarjun 2002;
Breton 1976, 304; Office of the Registrar General 2020. Yeats explains that “The language
and script questions have not been tabulated and I make now a recommendation
to the Government of India that they be not tabulated even if the suspended operations
are resumed" (1943, 9). Mitra (1994, 3207) explains that war, communal tensions, and
Yeats's transition to self-enumeration from household enumeration resulted in a completely
botched census that produced incoherent results left unpublished. Mallikarjun
(2002) shows that of the 2,554, one person identifies as a speaker of 'Vedic' Sanskrit
and another as a speaker of VedPali, while 93 and 5 people respectively claim to speak
Pali and Prakrit. Breton (1976, 304-8) provides a good overview of the 1961 census.
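The percentage column of Table 1 is simply the L1-Sanskrit returns divided by the total population, multiplied by 100. As a minimal sketch, the three most recent rows can be recomputed as follows; the figures are copied from the table, while the function name `share_percent` and the choice of three decimal places are assumptions made for illustration.

```python
# Three rows copied from Table 1: (year, L1-Sanskrit returns, total population).
TABLE_1 = [
    (1991, 49_736, 846_427_039),
    (2001, 14_135, 1_028_737_436),
    (2011, 24_821, 1_210_854_977),
]


def share_percent(returns, population, digits=3):
    """Return L1-Sanskrit returns as a percentage of the total population."""
    return round(returns / population * 100, digits)


if __name__ == "__main__":
    for year, returns, population in TABLE_1:
        print(year, share_percent(returns, population))
```

Even at the 1991 peak, the share stays well below one hundredth of one per cent of the population.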


This paper has three main sections. The first (§ 2) discusses key 
theo-political points related to Sanskrit’s imaginative power. § 3 ex- 
plicates Sanskrit’s national-level enumeration comparing the 2011 
and 2001 censuses. § 4 burrows down to the lowest administrative 
levels of four states to show which districts, sub-districts (tehsil/ta- 
luk, or taluq), and in some cases, blocks, returned the highest num- 
bers of L1-L3 Sanskrit tokens. 


2 Sanskrit, Theo-Politics, and Faith-Based Development 


Inspection of faith-based competitive diplomacy, in relation to Yoga
and Sanskrit, is sparse.¹² The ways in which Sanskrit is imbricated
are often underappreciated. While for some, Sanskrit might be a dead
language and a symbol for millennia of oppression, for others it is a 
treasure trove of untapped knowledge that might just save humanity 
and a heritage language one might like to speak (McCartney 2021a). 
Sanskrit helps define and chart one’s path toward a moral horizon. 
This speaks more about temporalities of becoming rather than be- 
ing (Fahy 2020). It helps an individual link to an archaic modernity 
or futured-past and potentially return to an imagined Vedic ‘Golden 
Age’ fuelled by re-enchanting, eco-sustainable, neo-Romantic, mys- 
tical holism (Hebden 2011; Subramaniam 2019). However, Sanskrit’s 
reclamation and acquisition are indelibly constrained by substrate 


12 The most closely related is Jacobs’ documentation (2016) of the world’s largest vol- 
unteer-based charity, the Art of Living Foundation, that originates from India. Watanabe 
(2019) explores a Japanese organisation’s operations in south-east Asia. Haynes’s (2021) 
edited volume explores international relations across several religions. Nelson (2021), 
however, provides the most comprehensive account of faith-based NGOs as non-state 
political and moral actors. 
interference from the L1s influencing how it, as an L2, is spoken. One 
of the key assets used to justify Sanskrit's role as a tool for develop- 
ment is its perceived linguistic purity. It is argued that only a ‘pure’ 
Sanskrit can deliver the utopian world it inspires. What, exactly, 
might a pure Sanskrit sound like and how might it power anything? 
Answering this question is a particularly vexing matter, especially 
when considering that the earliest layers of the Vedic corpus contain 
hundreds of loan words from other Indo-European and non-Indo-European 
languages.¹³ During the post-Vedic Period (c. 500 BCE), vernacular 
Sanskrit, otherwise known as bhasa, was referred to as the 
language of the conquerors (Burrow 1965). Burrow showed through 
a study of the morphology of Classical Sanskrit how the diversity of 
forms prevalent in the earlier Vedic language reduced significant- 
ly, even though it did become the language of Vedic and later Hin- 
du texts, commentaries, rituals, and literature (Subbarao 2008). In 
the contemporary revival movement, this simplification has contin- 
ued to the point where vernacular Sanskrit can be feasibly equated 
with a Sanskritised form of Hindi (Deshpande 2011; unpublished). 
The production, reception, and consumption of the 'Sanskrit-speaking' 
village narratives across various media appear to function in a 
similar way to phalasruti paratexts. The phalasruti is the textual com- 
ponent listing the benefits of hearing or reciting the particular text. 
Taylor (2012, 94-5) explains that these "fruits of hearing" texts of- 
ten include potential punishments and dangers listed along with the 
promise of heavenly rewards and that this "is a way of enabling the 
discourse to function as 'true' and is at least partly driven by a dis- 
tinctly earthly agenda". Another similarity between Taylor's discus- 
sion of paratexts and the 'Sanskrit-speaking' village discourse is the 
way in which quantifiable and qualifiable measurements seem elusive. 
Nonetheless, the 'Sanskrit-speaking' village boosts this signal. 
Consider the example of this faith-based development narrative that 
has evolved over the past decade in the state of Uttarakhand, which, 
in 2010, voted in Sanskrit as its second official language (Trivedi 
2010; McCartney 2021c). Even though this project was implemented a 
decade ago, and has endured changing governments and allegations 
of corruption, by 2013 321 crore (USD 275 million) had already been 
spent on promoting Sanskrit education in Uttarakhand. Regrettably, 
there is very little to show for it (Singh 2015). Recently, however, an 
updated policy has increased this imposition of language shift toward 
the target language, Sanskrit (Ahmad 2020). This new policy aims to 


13 Historical linguistics demonstrates that the oldest intangible artefact of the San- 
skrit canon, the Rgveda (c. 1600-1100 BCE), contains approximately two percent non-Ar- 
yan vocabulary, idiomatic expressions and phonemic influences, which are derived from 
the Dravidian language family, the Bactria-Margiana Archaeological Complex, and the 
Harappan Kubha-Vipas Substrate (see Witzel 2010). 
create a Sanskrit village in every 'block' (an administrative division 
often confused with sub-districts) of Uttarakhand (News Desk 2020; 
Upadhyay 2020). The state of Uttarakhand consists of two Divisions, 
13 Districts, 79 Sub-districts, and 97 Blocks. One wonders how much 
more investment might be needed to transform 97 villages scattered 
across the Himalayas into samskrta-gramah (‘Sanskrit-villages’). After 
all this investment, not one L1-Sanskrit token comes from any village 
or rural area.¹⁴ While 70% of the state's total population live in 
rural areas, 100% of the state's total (n = 246) L1-Sanskrit tokens are 
linked to urban areas. 

The ideology behind this top-down development project aims to 
reverse engineer India and the world through implementing a 'dhar- 
mic' lifestyle predicated by Sanskrit and Yoga. It aims to reform so- 
ciety toward an imagined Sanskritland, where just like the follow- 
ing song that attendees at Samskrita Bharati language camps learn, 
the aspiration is that Sanskrit will be spoken "grhe grhe [in eve- 
ry home], grame grame [in every village], nagare nagare [in every 
city], and dese dese [in every country]”. This might seem terribly 
banal and optimistically utopian, yet it is part of a yoga-oriented, 
faith-based, competitive diplomatic, soft power initiative. Evidence of 
this includes propositions such as the claim that Yoga and Sanskrit can solve 
climate change (Chauhan 2015; King 2015; Jacobs 2016; United Nations 
India 2019; Miller 2020). The final aspiration of Samskrita Bharati, 
however, is evidenced through its road map, which aims to build on 
simple Sanskrit (samskrtam saralam) utterances toward it being spo- 
ken all the time (samskrtam sarvesam). This potentially leads to an 


14 Sanskrit and Yoga are used to brand the nation (McCartney 2021a). The produc- 
tion of legitimacy and authority in diplomatic and economic arenas involves interweaving 
narratives involving a product, a place, and a nation (Aronczyk 2013) through which 
nations work to control their own images by implementing strategic communication 
strategies (Ermann, Hermanik 2018). This narrative is only a few clicks away from con- 
firming one's bias. The following quote, from Soumitra Mohan (2020), encapsulates the 
sentiment around its didactic potential: "The language deserves to be treated much 
better than it has been so far, more so when it has been called the best 'computerable' 
language. Sanskrit's credentials to be a language of future India are definitely better 
and greater than we have realised so far. Its revival will not only renew and revive the 
pride in our own cultural heritage but will also bring about spiritualism and the con- 
cept of a meaningful society and polity, thereby bringing order and peace all across 
the country, a desideratum for any developed society". A similar sentiment comes from 
Sampurnananda Samskrta Visvavidyalaya (SSV)'s homepage (2019) (https://ssvv.ac.in/brief-history). 
SSV is one of India's best-known Sanskrit universities, which is 
located in Benares, Uttar Pradesh. SSV explains on its homepage that "Sanskrit is the 
most ancient and perfect among the languages of the World. Its storehouse of knowl- 
edge is an unsurpassed and the most invaluable treasure of the world. This language 
is a symbol of peculiar Indian tradition and thought, which has exhibited full freedom 
in the search of truth, has shown complete tolerance towards spiritual and other kind 
of experiences of mankind, and has shown catholicity towards universal truth. This 
language contains not only a rich fund of knowledge for people of India, but it is also 
an unparalleled way to acquire knowledge and is thus significant for the whole World”. 
unavoidable critical mass (samskrtam anivaryam), which will inevitably 
lead to a language shift (samskrta sarvatra), and adoption of 
the new global lingua franca (visva bhasa) (McCartney 2017e). The 
perceived net-positive outcome (abhyudaya) has India positioned as 
the next global superpower and paragon of moral virtue, its ultimate 
dispenser (visva-guru) (Bharati 2014; Singh R.K. 2014; Press Trust 
India 2018; 2019). 


3 Comparing 2001 and 2011 Census Results 


Now, it is time for some bookkeeping (pusta palana). This section provides 
a straightforward analysis of the 2001 and 2011 census results. 
The Persons category is further disambiguated by two binaries, Rural/Urban 
and Males/Females. In the 2011 result (n = 24,821), fewer 
L1-Sanskrit tokens were returned from Rural areas (10,908) than from 
Urban areas (13,913). This gives a ratio of 44:56. However, Bihar 
completely reverses this with an 89:11 ratio in favour of rural areas. 
See [tab. 2] for a comparison of the 2011 top ten Rural:Urban states. 


Table 2 Rural and Urban Top Ten States, 2011 (Office of the Registrar General 2020) 


2011 Rural Top 10              2011 Urban Top 10

Bihar               3,041      Maharashtra       3,555
Rajasthan           1,461      Uttar Pradesh     1,668
Uttar Pradesh       1,394      Madhya Pradesh    1,020
West Bengal         1,066      Karnataka         1,016
Himachal Pradesh      857      Rajasthan           914
Madhya Pradesh        851      Tamil Nadu          743
Goa                   385      Gujarat             680
Odisha                306      Goa                 670
Jharkhand             264      NCT of Delhi        594
Maharashtra           247      Jharkhand           589
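
The 44:56 split quoted above follows directly from the national Rural and Urban counts. As a minimal sketch (the helper name `split_ratio` is an illustrative assumption, not part of any census tooling), the same rounding can be applied to any pair of counts taken from the tables:

```python
def split_ratio(first: int, second: int) -> tuple[int, int]:
    """Express two counts as whole-number percentages that sum to 100."""
    total = first + second
    if total == 0:
        raise ValueError("at least one count must be non-zero")
    first_share = round(100 * first / total)
    return first_share, 100 - first_share


# National L1-Sanskrit tokens, 2011: Rural vs Urban.
national = split_ratio(10_908, 13_913)
# Uttar Pradesh, 2011: Rural vs Urban (counts as listed in Table 2).
uttar_pradesh = split_ratio(1_394, 1_668)

print(f"India {national[0]}:{national[1]}")
print(f"Uttar Pradesh {uttar_pradesh[0]}:{uttar_pradesh[1]}")
```

Rounding the first share and deriving the second from it keeps the two parts summing to 100, which is how ratios such as 44:56 are conventionally reported.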


The sex difference splits 55:45. A total of 13,636 Male tokens were 
returned compared to 11,185 Female tokens. Almost every state/union 
territory returned more Males than Females, at a 60:40 average. However, 
Tamil Nadu has a 50:50 split, and Puducherry (Pondicherry) is the only 
instance where Males are fewer than Females, at 29:71 (Office of 
the Registrar General 2020). Table 3 compares the L1-L3 (total) ‘Sanskrit 
speakers’ between the 2001 and 2011 censuses [tab. 3]. Self-reported 
L1 speakers increased from 14,135 to 24,821; the L2 figure 
has stayed almost the same, while the L3 figure has dropped by 48%. 

Table 3 2001-2011 Sanskrit L1-L3 (Office of the Registrar General 2020) 


            L1          L2           L3
2001        14,135      1,234,931    3,742,223
2011        24,821      1,134,362    1,963,640
% change    76%         -8%          -48%


We can still use these data to locate sub-district administrative zones 
which have the highest numbers. Table 4 elaborates on the previous 
table, demonstrating that even though the total number of L1 rose 
between 2001 and 2011, the total L3 has almost halved [tab. 4]. This shows 
that the total number of L1-L3 for 2011 decreased by 37%, even 
though the 2011 L1 increased by 76%. 
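
The percentages in this paragraph can be reproduced from the raw counts. The following is a small sketch, assuming the conventional definition of percentage change relative to the earlier (2001) value; the function name `pct_change` is illustrative, and the 2001 L3 count is the one given in Table 3 (3,742,223), which is the value consistent with the 2001 L1-L3 total of 4,991,289.

```python
def pct_change(old: int, new: int) -> float:
    """Percentage change from old to new, relative to the old value."""
    return 100 * (new - old) / old


# (2001, 2011) totals for self-reported Sanskrit speakers.
counts = {
    "L1": (14_135, 24_821),
    "L2": (1_234_931, 1_134_362),
    "L3": (3_742_223, 1_963_640),
}

total_2001 = sum(old for old, _ in counts.values())
total_2011 = sum(new for _, new in counts.values())

for level, (old, new) in counts.items():
    print(f"{level}: {pct_change(old, new):+.0f}%")
print(f"L1-L3: {pct_change(total_2001, total_2011):+.0f}%")
```

Dividing by the 2001 figure rather than the 2011 figure matters: dividing the L1 increase by the 2011 count instead would give 43% rather than 76%.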


Table 4 L1-L3 Sanskrit 2001 and 2011 (Office of the Registrar General 2020) 


                   2001 Total    2011 Total    % change

Persons  L1            14,135        24,821        76%
         L2         1,234,931     1,134,362        -8%
         L3         3,742,223     1,963,640       -48%
         L1-L3      4,991,289     3,122,823       -37%
Male     L1             8,189        13,636        67%
         L2           875,107       713,772       -18%
         L3         2,751,121     1,266,098       -54%
         L1-L3      3,634,417     1,993,506       -45%
Female   L1             5,946        11,185        88%
         L2           359,824       420,590        17%
         L3           991,102       697,542       -30%
         L1-L3      1,356,872     1,129,317       -17%


Another significant point relevant across the L1-L3 range relates to 
the relationship between Hindi, English and Sanskrit. The 
L2-Sanskrit figures differ between table 4 and table 5 because 
table 4 counts all the L2-Sanskrit speakers [tab. 4]. In contrast, 
table 5 shows only the L1-Hindi L2-Sanskrit and L1-Hindi L2-English 
figures [tab. 5]. This figure is significant, as 95% of L2-Sanskrit speakers 
are L1-Hindi speakers (1,174,019 / 1,234,931). This locates the 
L2-Sanskrit phenomenon within an exceptionally Hindi-centric con- 


15 The Hindi language category consists of 57 sub-languages and dialects (Office of 
the Registrar General 2020). 
text. This is similar to Breton's (1976, 304) observation based on the 
1961 census, that "Le centre de la connaissance du sanskrit à notre 
époque est de loin la plaine gangétique (Uttar Pradesh: 79,000, Bihar: 
29,000, Punjab et Haryana: 20,000, Delhi: 9,000)" [The centre of 
Sanskrit knowledge in our time is by far the Gangetic plain]. However, it 
is important to appreciate that this group only comprises 0.3% of the 
total number of L1-Hindi speakers. The Male figures for both L2-Sanskrit 
and L2-English have fallen (by 24% and 11%) while the Female figures 
have increased (by 6% and 19%). The Totals have respectively 
decreased by 15% and 1.2%. 


Table 5 L1-Hindi, L2-Sanskrit/English between 2001 and 2011 (Office of the 
Registrar General 2020) 


Year    L2          TOTAL         Male          Female
2001    Sanskrit     1,174,019       830,827       343,192
        English     32,399,287    21,931,407    10,467,880
2011    Sanskrit       994,863       631,099       363,764
        English     32,018,890    19,592,236    12,426,654


Table 6 begins with the 2011 Total Persons figures for L1-Sanskrit 
[tab. 6]. In the next part of the table, the L1-Sanskrit L2-Hindi/English 
combination equates to 69% of the total L1-Sanskrit_L2-α 
figure. In the bottom part of the table, the combined L1-Sanskrit_L2-α_L3-Hindi/English 
portion is 77%. These data show the intimate 
relations that Hindi and English have with Sanskrit. 


Table 6 2011 L1-Sanskrit_L2-[Hindi-English]_L3-[English-Hindi] (Office of the 
Registrar General 2020) 


               TOTAL     Male      Female
L1-Sanskrit    24,821    13,636    11,185
L2             19,712    11,075     8,637
L2 Hindi       12,221     6,960     5,261
L2 English      1,347       727       620
H+E Total      13,568     7,687     5,881
H+E %              69        69        68
L3              7,910     4,600     3,310
L3 Hindi        2,267     1,327       940
L3 English      3,796     2,211     1,585
H+E Total       6,063     3,538     2,525
H+E %              77        77        76

Table 7 below presents data related to L1-Sanskrit L2-α [tab. 7]. It 
shows Hindi's clear L1-Sanskrit L2-α majority with 62% of the L2 
category. While Hindi is one of India's official languages (alongside 
English, which also ranks high), the data show the intimate relation 
with the Hindi heartland. Notice that Female L2-Odia tokens 
were higher than Male (180:138), as were Konkani (83:76) and 
Others (39:37). 


Table 7 L1-Sanskrit/L2-α rankings, 2011 (Office of the Registrar General 2020) 


L2           TOTAL     Male     Female
Hindi        12,221    6,960    5,261
Marathi       1,934    1,071      863
English       1,347      727      620
Bengali         900      462      438
Kannada         839      502      337
Tamil           551      281      270
Gujarati        365      254      111
Urdu            325      193      132
Odia            318      138      180
Telugu          271      150      121
Malayalam       163       86       77
Konkani         159       76       83
Others           76       37       39


Table 8 shows the fluctuations between the dominant Sanskrit states 
by comparing the results from the 2011 and 2001 censuses [tab. 8]. 
What might explain Uttar Pradesh's fewer speakers and Maharash- 
tra's increase? 


Table 8 2011 and 2001 State-level L1-Sanskrit Census Results (Office of the 
Registrar General 2020) 


                  2001     2011     % change
Maharashtra        408     3,802      832
Bihar              754     3,388      349
Uttar Pradesh    7,048     3,062      -57
Rajasthan          989     2,375      140
Madhya Pradesh     381     1,871      391
Karnataka          830     1,218       47


16 Note: this list is only a portion of the total. 
Did all the speakers from Uttar Pradesh move to Mumbai? How can 
the dramatic rise in states like Rajasthan, Madhya Pradesh, and Kar- 
nataka be explained? These data show that the Sanskrit-speaking 
sentiment predominates in the Hindi belt. Having provided some na- 
tional level statistics, what follows is a closer look at these top-rank- 
ing states, pinpointing L1-L3 Sanskrit tokens down to district and 
sub-district administrative levels. This begins in Maharashtra. 


4 An Explication of Census Data in Several States 


4.1 Maharashtra 


Table 9 shows the top ten districts in the state [tab. 9]. Making sense of 
all the data is challenging because there is an overwhelming amount 
of overlap between tables. For ease of comprehension, table 9 is a 
modified version of the original. Several columns have been deleted.¹⁷ 
Pune District (28%) and Pune City Sub-district (22%) respectively 
represent the highest L1-Sanskrit administrative zones 
in their category for the state. It is clear from the Urban 3,555:Rural 
247 ratio that about 94% of Maharashtra's Sanskrit tokens are located 
in urban areas. 


Table 9 2011 Maharashtra C-16 Sanskrit-Top 10 Districts (Office of the Registrar 
General 2020) 


C-16 POPULATION BY MOTHER TONGUE 
State Code - 27 / Mother Tongue Code - 17 / Mother Tongue Name - Sanskrit 


District  Area name               Total                 Rural                Urban
code                          P      M      F       P     M     F       P      M      F
000       MAHARASHTRA     3,802  1,984  1,818     247   145   102   3,555  1,839  1,716
521       Pune            1,091    567    524      51    33    18   1,040    534    506
517       Thane             710    385    325      14     9     5     696    376    320
518       Mumbai Suburban   556    263    293       0     0     0     556    263    293
516       Nashik            442    239    203      21    12     9     421    227    194
505       Nagpur            195     98     97       6     6     0     189     92     97
519       Mumbai            109     54     55       0     0     0     109     54     55
511       Nanded             61     30     31       5     2     3      56     28     28
515       Aurangabad         55     29     26       5     3     2      50     26     24
514       Jalna              51     27     24      47    24    23       4      3      1
504       Wardha             42     21     21      22    10    12      20     11      9


17 As an example, before the token numbers are given, on the right of each table, the 
complete line for Pune District would read, C0116 (table code, denotes that these data 
are from the C-16 ‘mother tongue' table), 027 (state code), 521 (district code), 0000 
(sub-district code), Pune (area name), 017000 (mother tongue code), Sanskrit (mother 
tongue name). This would look like the following, C0116-027-521-0000-Pune-017000-Sanskrit. 
Where it gets confusing is when districts, sub-districts, and towns have the 
same or similar names and coding. For example, the district of Pune (27-521-00000) is 
similar to its sub-district constituent, Pune City (27-521-04194). The same logic applies 
to every other district/sub-district across the entire suite of census tables. Another 
example is Nashik District (27-516-00000), which has a similarly named sub-district, 
Nashik (27-516-04152). This raises some methodological issues. We can use these fields 
to find all sub-districts in each state and rank them. We can also filter just the districts 
and rank them in the same way. 
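
The coded fields described in footnote 17 lend themselves to mechanical filtering and ranking. The sketch below is one possible way to do this, under stated assumptions: the hyphen-joined key layout follows the Pune example in the footnote, a row is treated as district-level when its sub-district code is all zeros, and the names `CensusLine`, `parse_line`, `is_district` and `rank` are illustrative. The district counts come from Table 9; the two sub-district counts are placeholders for illustration only, not census figures.

```python
from typing import NamedTuple


class CensusLine(NamedTuple):
    table: str
    state: str
    district: str
    subdistrict: str
    area: str
    tongue_code: str
    tongue: str
    persons: int


def parse_line(key: str, persons: int) -> CensusLine:
    """Split a key such as C0116-027-521-0000-Pune-017000-Sanskrit into fields."""
    parts = key.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 hyphen-separated fields: {key!r}")
    return CensusLine(*parts, persons)


def is_district(line: CensusLine) -> bool:
    """District-level rows carry an all-zero sub-district code."""
    return int(line.subdistrict) == 0


def rank(lines: list[CensusLine]) -> list[CensusLine]:
    """Order rows from the highest to the lowest Persons count."""
    return sorted(lines, key=lambda line: line.persons, reverse=True)


lines = [
    parse_line("C0116-027-516-0000-Nashik-017000-Sanskrit", 442),
    parse_line("C0116-027-521-0000-Pune-017000-Sanskrit", 1_091),
    parse_line("C0116-027-517-0000-Thane-017000-Sanskrit", 710),
    # Placeholder counts below: hypothetical values, not census figures.
    parse_line("C0116-027-516-04152-Nashik-017000-Sanskrit", 1),
    parse_line("C0116-027-521-04194-Pune City-017000-Sanskrit", 2),
]

districts = [line.area for line in rank([l for l in lines if is_district(l)])]
subdistricts = [line.area for line in rank([l for l in lines if not is_district(l)])]
print(districts)
print(subdistricts)
```

Keeping the sub-district code as a separate field is what disambiguates Nashik District from Nashik Sub-district, which share an area name. An area name that itself contains a hyphen would need a more careful parser than this one.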


Table 10 shows the L1-L3 relations found in the C-17 tables [tab. 10]. 
In particular, the relations between the state language, Marathi, and 
Hindi, English, and Sanskrit are located. While Sanskrit clusters with 
Hindi and English, at the state level it becomes a bit more complicated, 
since it is important to consider the relevant state language in this 
clustering. Hindi and English, however, still have an overwhelming 
presence. The left column shows L1-Marathi at the national level,¹⁸ 
the middle column shows how many of these returned L2-Hindi, while 
the right column shows the figures for L3-Sanskrit. This repeats for 
each line, demonstrating the L1-Marathi L2-α L3-β scenarios. Using 
the same formula, the second part of the table zooms in to the state-level 
figures. L1-Marathi L2-Hindi L3-α equates to 42% of the L2 category. 
The final part of this table focuses on L1-Sanskrit L2-α L3-β. 


Table 10 2011 Maharashtra C-17 Bi-/Trilingualism (Office of the Registrar General 
2020) 


Marathi across India 

L1                      L2                      L3

Marathi 83,026,680      Hindi 34,650,142        Sanskrit 57,070
                        Kannada 1,468,221       Sanskrit 559
                        English 1,395,659       Hindi 870,985
                                                Kannada 26,907
                                                Sanskrit 13,142
                        Gujarati 361,327        Sanskrit 238
                        Khandeshi 292,555       Sanskrit 124
                        Konkani 98,318          Sanskrit 168
                        Sanskrit 56,360         Hindi 22,477
                                                English 19,780


L1-Marathi in Maharashtra 

L1                      L2                      L3

Marathi 77,461,172      Hindi 32,660,911        Sanskrit 54,500
                        English 1,202,810       Sanskrit 12,673
                        Kannada 512,117         Sanskrit 471
                        Khandeshi 292,035       Sanskrit 124
                        Telugu 150,921          Hindi 31,881
                                                Sanskrit 38
                        Gujarati 60,552         Sanskrit 142
                        Sanskrit 54,100         Hindi 21,474
                                                English 19,415


L1-Sanskrit in Maharashtra 

L1                      L2                      L3

Sanskrit 3,802          Hindi 1,782             English 888
                                                Marathi 365
                        Marathi 1,259           Hindi 598
                                                English 306
                        English 334             Marathi 49


18 Notice C17-0000 2011 compared to C17-2700 in the middle and right panels, respectively. 
The state code for Maharashtra is 2700 and the national code is 0000. 


This demonstrates that the L1-SOL (State Official Language) L2 L3 cluster 
with Hindi and English is a predictor for the relative number of L2 and 
L3 Sanskrit speakers. 
This is consistent across every national and state-level enumeration. 
If we reverse the order (right panel) and begin with L1-Sanskrit, the 
numbers for L2 and L3 also follow a similar order, with the SOL, 
Hindi, and English in superior positions compared 
to other possibilities. 


Figure 2 2011 Maharashtra: Districts L1-Sanskrit. (India 2020) 

Figure 2 shows the 2011 District-level L1-Sanskrit figures for 
Maharashtra [fig. 2]. The top five districts are Pune 1,091, Thane 710, 
Mumbai Suburban 556, Nashik 442, and Nagpur 195. 


Figure 3 2011 Maharashtra: Sub-districts L1-Sanskrit. (India 2020) 


Figure 3 zooms down to the sub-district level to show the top five 
districts in more detail [fig. 3]. Notice how Nagpur District has been 
superimposed in the red box. Next, we move on to the state of Bihar. 


4.2 Bihar 


In Bihar, the highest number of L1-Sanskrit tokens is located in the 
eastern administrative area of Kishanganj District (district code 210). The top five 
districts are listed in [tab. 11]. 


Table 11 2011 Top 5 Districts: Bihar (Office of the Registrar General 2020) 


C-16 POPULATION BY MOTHER TONGUE 
State Code - 10 / Mother Tongue Code - 17 / Mother Tongue Name - Sanskrit 


District  Area name         Total                 Rural                 Urban
code                    P      M      F       P      M      F       P     M     F
000       BIHAR     3,388  1,845  1,543   3,041  1,654  1,387     347   191   156
210       Kishanganj 1,028   543    485     962    507    455      66    36    30
212       Katihar     604    324    280     598    321    277       6     3     3
211       Purnia      574    326    248     544    309    235      30    17    13
209       Araria      383    220    163     381    219    162       2     1     1
230       Patna       119     67     52       4      3      1     115    64    51

From within these districts, Bihar's top ten sub-districts are Dighalbank 
558 (Kishanganj District), Terhagachh 307 (Kishanganj District), 
Baisi 246 (Purnia District), Azamnagar 154 (Katihar District), 
Sikti 120 (Araria District), Dagarua 108 (Purnia District), Palasi 105 
(Araria District), Falka 92 (Katihar District), Thakurganj 83 (Kishanganj 
District), and Patna Rural 82 (Patna District). In [figs 4-5] notice 
the clustering in the far east of the state across four districts. 
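
Because each row of the C-16 extracts reports Persons, Males and Females for Total, Rural and Urban, the rows can be cross-checked arithmetically before they are ranked. The following sketch is an illustration rather than part of the census workflow: the names `Row` and `check` are assumptions, and Purnia's Total is entered as 574, the value implied by its Rural and Urban columns (544 + 30) and by its Male and Female columns (326 + 248).

```python
from typing import NamedTuple


class Row(NamedTuple):
    area: str
    total: tuple[int, int, int]  # (Persons, Males, Females)
    rural: tuple[int, int, int]
    urban: tuple[int, int, int]


def check(row: Row) -> list[str]:
    """Return a description of every identity the row violates."""
    problems = []
    groups = (("Total", row.total), ("Rural", row.rural), ("Urban", row.urban))
    for label, (persons, males, females) in groups:
        if persons != males + females:
            problems.append(f"{row.area}: {label} P={persons}, M+F={males + females}")
    for index, column in enumerate(("P", "M", "F")):
        combined = row.rural[index] + row.urban[index]
        if row.total[index] != combined:
            problems.append(
                f"{row.area}: Total {column}={row.total[index]}, Rural+Urban={combined}"
            )
    return problems


bihar = [
    Row("BIHAR", (3388, 1845, 1543), (3041, 1654, 1387), (347, 191, 156)),
    Row("Kishanganj", (1028, 543, 485), (962, 507, 455), (66, 36, 30)),
    Row("Katihar", (604, 324, 280), (598, 321, 277), (6, 3, 3)),
    Row("Purnia", (574, 326, 248), (544, 309, 235), (30, 17, 13)),
    Row("Araria", (383, 220, 163), (381, 219, 162), (2, 1, 1)),
    Row("Patna", (119, 67, 52), (4, 3, 1), (115, 64, 51)),
]

issues = [problem for row in bihar for problem in check(row)]
print(issues or "all rows consistent")

# A single misread digit (514 instead of 574) breaks two identities at once.
misread = Row("Purnia", (514, 326, 248), (544, 309, 235), (30, 17, 13))
print(check(misread))
```

A check of this kind is a cheap safeguard when figures are transcribed by hand from the census spreadsheets into summary tables.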


Figure 4 2011 Bihar: Districts L1-Sanskrit (India 2020) 


Figure 5 2011 Bihar: Sub-districts L1-Sanskrit (India 2020) 


Unlike Maharashtra, 90% of Bihar's L1-Sanskrit tokens are Rural. 
Where might they be located? The 'village-level' C-16 tables are not 
available. One cumbersome triangulation method requires scrutinising 
other tables. The state's highest returning district is Kishanganj. 
Beginning with the 'town level' C-16 table for Bihar, we learn that 
Kishanganj Nagar lists 66 Urban L1-Sanskrit tokens (Office of the 
Registrar General 2020). We can combine this with the Village Directory 
of Kishanganj District, Bihar (Office of the Registrar General 
2020).¹⁹ The relevant code for Dighalbank's Block is 002. We can 
filter all the listed towns and villages by this code and then 
filter further for all the places with a population under 5,000 inhabitants 
(the town/village population boundary). While we are currently 
unable to locate the tokens with more accuracy, we are, nonetheless, left with a 
list of 'rural villages' from which the tokens of Bihar's highest L1-Sanskrit 
sub-district potentially come. Let us move on to Madhya Pradesh. 
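
The two filtering steps described above (block code, then the 5,000-inhabitant boundary) can be sketched as follows. This is a hedged illustration only: the record layout, the field names `block_code`, `name` and `population`, and the sample entries are hypothetical stand-ins for rows of the Village Directory, not data taken from it.

```python
TOWN_THRESHOLD = 5_000  # places at or above this population are treated as towns


def rural_candidates(directory: list[dict], block_code: str) -> list[dict]:
    """Keep the places in one block that fall under the town threshold,
    ordered from the most to the least populous."""
    matches = [
        place
        for place in directory
        if place["block_code"] == block_code
        and place["population"] < TOWN_THRESHOLD
    ]
    return sorted(matches, key=lambda place: place["population"], reverse=True)


# Hypothetical sample records, for illustration only.
sample_directory = [
    {"block_code": "002", "name": "Village A", "population": 4_200},
    {"block_code": "002", "name": "Village B", "population": 850},
    {"block_code": "002", "name": "Town C", "population": 7_300},
    {"block_code": "005", "name": "Village D", "population": 1_200},
]

candidates = rural_candidates(sample_directory, "002")
print([place["name"] for place in candidates])
```

The result is a candidate list only: it narrows down where Rural tokens could have been returned, but it cannot attribute any token to a particular village.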


4.3 Madhya Pradesh 


Madhya Pradesh returned more L1-Sanskrit Urban tokens (54%) than Rural 
[Rural 851 (451 M/400 F) and Urban 1,020 (537 M/483 F)]. This further 
complicates the fabled 'Sanskrit-speaking' village narrative. However, 
the highest-ranking sub-district, Pipariya, accounts for 96% 
(Rural 485/Urban 5) of Hoshangabad District's total 524 (496 Rural/28 
Urban) (Office of the Registrar General 2020). Pipariya Sub-district 
alone accounts for 57% of the state Rural total across all the 
367 sub-districts in Madhya Pradesh and 26% of the state's total. At 
the village level this can be narrowed down to 114 villages in Pipariya 
Sub-district (Office of the Registrar General 2020). 

If this sub-district does have such a high number of Sanskrit speakers, 
it is unclear why these places are not as famous 
as the three so-called 'Sanskrit-speaking' villages (Jhiri, Sarangpur 
Sub-district, Rajgarh District; Mohad, Gadawara Sub-district, Narsimhapur 
District; and Baghuwar, Kareli Sub-district, Narsimhapur District). 
What is more curious is the fact that the districts in which these three 
'Sanskrit-speaking' villages are located barely returned any L1-Sanskrit 
tokens. The internet contains countless sites that assert that everyone, 
or nearly everyone, in these villages speaks Sanskrit. The district 
and sub-district L1-Sanskrit tokens are represented in [figs 6-7]. 


19 Kishanganj District is comprised of seven 'blocks', which is the same number of 
sub-districts. The difference between a block and a sub-district is its function. A block 
is a geographical unit for rural development, whereas a sub-district (tehsil) is a geo- 
graphical unit for revenue collection (Maheshwari 1984). 

Figure 6 2011 Madhya Pradesh: Districts L1-Sanskrit. (India 2020) 


Figure 7 2011 Madhya Pradesh: Sub-districts L1-Sanskrit. (India 2020) 


As with Bihar, the village/town lists for Pipariya 
Sub-district can be filtered to remove the towns with 5,000+ inhabitants. 
The result is 142 villages, ranging in size from 4,767 (Khapar Kheda) 
down to two (Pathi Thekredri) inhabitants. It is from this list that the 
majority of Madhya Pradesh’s Sanskrit tokens come (Office of 
the Registrar General 2020). 

The ST-15 (Scheduled Tribes) table is a subset of the C-16 tables 
relating to Scheduled Tribes’ mother tongues. These are available at 
the state/district levels. 6.7% of the L1-Sanskrit state total (149/1871) 
is comprised of people who identify as members of Scheduled Tribes. 
Of this, 126 tokens are from Hoshangabad District (Office of the Registrar 
General 2020). 

Table 12 2011 and 2001 Madhya Pradesh L1-Sanskrit (Office of the Registrar 
General 2020) 


Total Persons - Sanskrit    L1       L2         L3
2011                        1,871    246,940    454,245
2001                          381    210,400    960,176
% change                      391         17        -53


Table 12 scratches beneath the surface to show the changes between 
2001 and 2011 across the L1-L3-Sanskrit range [tab. 12]. This pattern 
reflects the national data. As we have discussed, the L1 change is 
aspirational. Otherwise, it makes little sense. How could 
a 391% increase have occurred in just one decade? The L2 figure is 
more feasible; however, the L3 decline of 53% is a worrying predictor 
for the vitality of Sanskrit. Could it be that all the Sanskrit village 
mythology in circulation is harming Sanskrit's role? In relation 
to the Hindi/English/Sanskrit trinity, in Madhya Pradesh L1-H L2-E 
L3-S comprises 98% of all the L3-Sanskrit possibilities. This decline 
is also considerable among the Scheduled Tribes of Madhya Pradesh. 
Table 13 highlights the changes of L1-Bhili/Bhilodi and L1-Gondi 
L2-α L3-Sanskrit [tab. 13]. Both have declined dramatically between 
2001 and 2011. 


Table 13 2011 and 2001 Madhya Pradesh L1-ST_L2-α_L3-Sanskrit (Office of the 
Registrar General 2020) 


                                     2011    2001     % change
L1-Bhili/Bhilodi_L2-α_L3-Sanskrit     779    1,677      -54
L1-Gondi_L2-α_L3-Sanskrit             661      889      -26


Madhya Pradesh's 2011 L2-Sanskrit total is 246,940. This includes 
the Scheduled Tribes' total of 13,540 (5% of state total). 
Rural tokens account for 84% of this 13,540. Of this Scheduled 
Tribes' total, 52% come from a group of 60 ST-L1 languages, including 
Gond, Arakh, and Agaria (prevalent in Hoshangabad District). The 
next group of languages includes Bhil and Bhilala (15%), and Kol (10%) 
(Office of the Registrar General 2020). Having located the districts 
and sub-districts within Madhya Pradesh where L1-L3 Sanskrit tokens 
were returned, a clearer image of where future fieldwork could 
be directed emerges. The final state is Uttar Pradesh. 


4.4 Uttar Pradesh 


Uttar Pradesh is India's most populous state with 199,581,477 people 
(Office of the Registrar General 2020). While Uttar Pradesh experienced 
a 57% decrease in L1-Sanskrit tokens, the total 
nationwide increase for L1-Sanskrit is 76% (Office of 
the Registrar General 2020). What has caused Uttar Pradesh to decrease 
when other states witnessed large increases? Three districts 
are worthy of closer scrutiny because of how dramatically they changed 
between 2001 and 2011. The state capital, Lucknow, fell 
by 82%, while Unnao fell by 95%, and Gorakhpur, the electorate 
of UP's Chief Minister, Yogi Adityanath, fell by 99% (Office 
of the Registrar General 2020). The top ten districts are Kanpur 
Nagar 932, Sitapur 722, Sultanpur 323, Ghaziabad 128, Saharanpur 
85, Ballia 79, Lucknow 55, Varanasi 55, Bijnor 50, and Agra 41 (Office 
of the Registrar General 2020). Figure 8 provides the total Person 
numbers for each district [fig. 8]. 


Figure 8 2011 Uttar Pradesh: Districts L1-Sanskrit. (India 2020) 


Figure 8 shows the numbers for each district in which L1-Sanskrit tokens 
were returned. In Uttar Pradesh, 1,697 Male L1-Sanskrit tokens 
were returned compared to 1,365 Female. Like Madhya Pradesh and 
Maharashtra, the Urban L1-Sanskrit (1,668) token count is slightly 
higher than the Rural equivalent (1,394). The urban area of Varanasi 
(Benares), which is considered one of the holiest cities of Hinduism 
and is home to the famous Sanskrit university, Sampurnananda 
Samskrta Visvavidyalaya (SSV), returned only 54 tokens (36 M/18 
F). Similarly, Breton (1976, 306) wonders why the famous traditional 
seats of Sanskrit learning have fewer speakers of Sanskrit. Notice 
that Kanpur is the city with the highest number of returned 
L1-Sanskrit tokens (Office of the Registrar General 2020). The top ten 
sub-districts are Kanpur 931, Mahmudabad 482, Lambhua 310, Ghaziabad 
127, Sidhauli 94, Biswan 83, Ballia 62, Varanasi 55, Deoband 
51, and Bijnor 50 (Office of the Registrar General 2020). Kanpur is 
also the highest ranked town. Zooming in, figure 9 shows the unique 
L1-Sanskrit results for each of the six tehsils in Sitapur District [fig. 9]. 


Figure 9 2011 Sitapur District within Uttar Pradesh. (India 2020) 


In 2001, Sitapur District, Uttar Pradesh (430 kilometres east of New 
Delhi), reportedly had the highest number of all the districts in the 
country, at 558 (Priyanka 2014). The 2011 total of 722 (M 378/F 
344) is a 28% increase. Sitapur District also represents 24% of the 
state's total. It splits Rural 1275 (M 676/F 599) and Urban 73 (M 33/F 
40). This means that the district of Sitapur provides 46% of Uttar 
Pradesh's L1-Sanskrit tokens, which, itself, is comprised of 90% Rural 
from across the district's six tehsils (Office of the Registrar General 
2020). When compared with the 2001 results for the same 
district, the fluctuations are quite remarkable. Where did 2,604 Sanskrit 
'speakers' from Biswan tehsil go? Sitapur District has the second 
highest L1-Sanskrit number in Uttar Pradesh. 

Figure 10 2001 and 2011 Sitapur District: Sub-districts L1-Sanskrit. (India 2020) 


Figure 10 shows the 2011 figures for each tehsil of Sitapur District, 
which indicate that Mahmudabad Tehsil has the most with 482 
(M 258/F 224) [fig. 10]. Table 14 displays the top five sub-districts 
[tab. 14]. Three of the list are located in Sitapur District, and Biswan 
Town is the sixth-highest ranked in the state, as well as the fifth-highest 
ranked tehsil. Both Biswan and Sidhauli tehsils return more Female 
than Male tokens, while Lambhua returns the same number for 
both sexes. 


Table 14 2011 Top 5 Rural Tehsils (Office of the Registrar General 2020) 


Tehsil        Total   Male   Female 
Mahmudabad      482    258      224 
Lambhua         310    155      155 
Sidhauli         94     49       45 
Ballia           62     32       30 
Biswan           83     39       44 
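
The internal consistency of table 14 can be checked mechanically. What follows is only a minimal Python sketch under stated assumptions: the figures are copied from table 14, while the names `ROWS`, `inconsistent_rows` and `ranked_by_total` are hypothetical and introduced purely for illustration. It verifies that Male plus Female equals Total for each tehsil and re-sorts the rows by Total.

```python
# Rows copied from table 14: (tehsil, total, male, female).
# The variable and function names are illustrative assumptions.
ROWS = [
    ("Mahmudabad", 482, 258, 224),
    ("Lambhua", 310, 155, 155),
    ("Sidhauli", 94, 49, 45),
    ("Ballia", 62, 32, 30),
    ("Biswan", 83, 39, 44),
]


def inconsistent_rows(rows):
    """Return the tehsils whose Male + Female does not equal Total."""
    return [name for name, total, male, female in rows if male + female != total]


def ranked_by_total(rows):
    """Return the rows sorted from the highest to the lowest Total."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # An empty list means every row is internally consistent.
    print(inconsistent_rows(ROWS))
    for name, total, male, female in ranked_by_total(ROWS):
        print(f"{name:<12}{total:>6}{male:>6}{female:>8}")
```

With the figures above, the check returns an empty list, and the sort places Biswan (83) above Ballia (62).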


In contrast to Kanpur, Mahmudabad Tehsil (Rural) is the second high- 
est tehsil in the state with 482 (258 M/224 F). If all the L1-Sanskrit 
‘speakers’ do not live in towns then is it safe to assume they live in 
villages? Mahmudabad Tehsil has 341 villages (Office of the Regis- 
trar General 2020). The question is, in which villages are L1-San- 
skrit ‘speakers’ living? 
Table 15 2011 Religious-Political affiliations and Literacy-Sex ratio comparisons 
(Office of the Registrar General 2020) 


State   Tehsil name   Population   Literacy   Sex ratio   Political preference   Hindu % 


MP      Pipariya         181,261       63.8         896   BJP                         95 
MP      Sarangpur        186,082       56.4         950   BJP                         84 
UP      Kanpur         3,470,334       72.8         860   BJP                         83 
MP      Indore         2,389,511       73.6         925   BJP                         74 
UP      Mahmudabad       596,252       47.1         884   Samajwadi                   74 
MP      Huzur          2,107,523       72           920   BJP                         72 


Finally, table 15 lists some of the main tehsils mentioned above 
[tab. 15]. They are ranked, first, by the percentage of Hindus in 
each tehsil. Most display a preference for the Bharatiya Janata Party 
(BJP), regardless of the size of the Hindu majority. The curious thing 
is that some of the highest L1-Sanskrit tehsils in Madhya Pradesh and 
Uttar Pradesh return literacy and sex ratios well below national and 
state averages. What does this tell us about the ability of Sanskrit 
to 'transform lives' and the development grand narrative it serves? 


5 Concluding Remarks 


This paper compares the 2001 and 2011 Sanskrit census results, paying 
closer attention to the top-ranking states of Maharashtra, Bihar, 
Madhya Pradesh and Uttar Pradesh. The results of this 
study are unexpected. Further analysis across the remaining states 
will be released in forthcoming articles. Still, we are able to deter- 
mine that Maharashtra returned the highest number for L1-Sanskrit; 
that most of the L1-Sanskrit speakers live in urban areas; that the 
'Sanskrit-speaking' village is an aspirational myth not borne out in 
the government data; that more men claim to speak Sanskrit; that 
the overwhelming majority also speak Hindi and English, regardless 
of L1-L3 combinations; that the L1-Hindi/L2-Sanskrit combination 
accounts for 92% of the total L2-Sanskrit speakers; and that, 
regardless of the rhetorical and ideological bluster, Sanskrit continues 
to fall short of its alleged capacity for attaining the #Sanskrit4Cli- 
mateAction goals related to key indicators, such as sex, literacy, and 
health development. 

Sanskrit, alongside Yoga, is a key instrument for branding the nation. 
This is a domestic as well as international project that has a symbiot- 
ic relation to the global wellness and leisure markets, which is fertile 
ground for cultivating banal nationalism. From a language acquisition 
perspective, the media's preference for providing pithy and inaccurate 
data from the census seems counterproductive, if not misplaced. While 
moods certainly lift on hearing that the L1 level rose between 2001 and 
2011, the more interesting categories relate to the L2 and L3 levels, both 
of which have fallen; the L3 level fell by almost 90%. 

With this finer-grained analysis, a clearer map emerges of where people 
who have an affinity to identify as L1-L3-Sanskrit speakers were 
located at the last census. Regardless of whether they do in fact speak 
Sanskrit, these data will aid future research related to in-country 
field work, enabling strategic sorties down to the sub-district tehsil 
level. Finally, the 2021 Indian census will be the first digitised 
census the nation embarks on. Hopefully, this allows for data to be 
enumerated, rationalised, and published much sooner than the sev- 
en-year lag that occurred at the last census. Unlike the botched 1941 
census, which first introduced self-reporting of data, it is anticipated 
that this new age of demography in India will not suffer the same fate. 
From a linguistic perspective, building on demographic data potenti- 
ates future exploration of various aspects of language revitalisation 
and second-language acquisition through in-depth study of particu- 
lar linguistic features related to substrate interference and imper- 
fect learning of the target language. 
1 A Glance at the Literary Success of the Pali Milindapañha 


I venture to think that the ‘Questions of Milinda' is undoubtedly 
the master-piece of Indian prose; and indeed is the best book of its 
class, from a literary point of view, that had then been produced in 
any country. (Rhys Davids 1890, XLVIII) 


Thomas W. Rhys Davids wrote these words in his introduction to the 
first full English translation of the Pali text called Milindapañha ‘Questions 
of Milinda’.1 At that time, he did not know of the existence of 
a Chinese rendition of the text called Nàxiān bǐqiū jīng 那先比丘經 
(T 1670 versions A and B), which would correspond to the Sanskrit 
*Nagasenabhiksusutra ‘the Sutra of the Monk Nagasena’.2 Interestingly 
enough, Rhys Davids probably bestows on the text more importance 
than that granted by the Buddhist tradition itself. Rhys Davids 
was writing “from a literary point of view”, but the importance of the 
text from a doctrinal perspective surely deserves equal attention. The 
significance of the Milindapañha within the Theravada tradition is, indeed, 
quite controversial. The Milindapañha, together with texts such 
as the Nettippakarana and Petakopadesa, is regarded as a canonical 
text which is part of the Khuddakanikaya only by the Burmese tradition.3 
However, it was considered important enough to be quoted as 
an authority in Pali commentaries: some of them even define it as a 
sutta (= sutra),4 in line with its nomenclature in the Chinese versions 


An earlier version of this paper entitled “Remarks on the Pali and Chinese Versions of 
the Buddhist Milindapañha” was presented for the first time at the international conference 
Il re ellenistico e il saggio indiano. Il Milindapañha e il suo contesto / The Hellenistic 
King and the Indian Wise Man. Putting the Milindapañha in Its Context, University 
of Bologna, 19-20 September 2019. I am grateful to Alberto Pelissero and Saverio Marchignoli 
for their remarks on the first draft. I should also thank Ven. Bhikkhu Analayo 
and Giuliano Giustarini for having sent me valuable bibliographical sources to consider; 
Kenji Takahashi for the help in dealing with a Japanese source; and the two anonymous 
reviewers for their detailed feedback. It goes without saying that all remaining 
errors are my own responsibility. All translations from Pali and Chinese are my own 
unless otherwise noted. 


1 The Pali title of the text, as known today, seems to be due to the editorial choice 
made by Trenckner (1880, VI) in the first full edition of the text. A variety of titles in 
some manuscripts is provided by Ooi (2021, 181-4). It is also worth highlighting that a 
thorough revision of the current editio princeps made by Trenckner in light of the Sia- 
mese edition of the text is still a desideratum and a worthwhile future task to accom- 
plish. In this regard, see Skilling (2010) and Ooi (2021, 174). 


2 It would seem that we can say either that the two Chinese versions "stem from a 
single original rendition" (Analayo 2021a, 15) or, in other words, that "the two extant 
Chinese versions are the same work, one simply an amplification of the other" (Lev- 
man 2021, 108). 


3 See Norman 1983, 31 and Allon 2018, 237-8. 
4 See Nidd-a I, 166 = As, 108 quoted and discussed by Mori 1997-98, 297-8. 
(jīng 經).5 Then, was this text an important text or not? We can say 
that it was important enough to reach us, a fact that should not be underestimated 
since the method of transmission at that time involved a 
great deal of effort. Another Theravada text called *Vimuttimagga, for 
example, disappeared in the Indian mainland and Sri Lanka and has 
reached us in its entirety only in its Chinese translation (Jiětuō dào lùn 
解脫道論; T 1648). Now, we may legitimately wonder for whom and 
why this text was important. As has been theorised since the nineteenth 
century, the text is a product of Northwest India (Trenckner 1880, VII) 
and, as recently maintained by Stefan Baums (2018, 42), was created 


for the conversion of an audience that was neither Indian nor Greek, 
but part of the cosmopolitan melting pot of Gandhara that was In- 
dianized enough for the literary form of the Questions to appeal 
to it, Hellenized enough to be persuaded by its Greek style of ar- 
gumentation and worldly enough to identify with the figure of the 
most famous foreign ruler of Gandhara as he undergoes conversion 
to Buddhism. 


Some elements could additionally lead us to hypothesise that it was a 
text created by monks for lay people or for people unfamiliar with Buddhism 
(the text might also have had the purpose of evangelising).6 In 
this regard, it is interesting to consider the comparison made by Rhys 
Davids (1894, XX-XXVII) between the Milindapañha and Kathavatthu. 
Both texts deal with controversial points in Buddhism, but they do it 
in quite different ways. It is worth mentioning the words of Rhys Da- 
vids (1894, XXVI): 


the controversy in the older book [i.e. Kathavatthu] is carried on 
against members of the same communion, whereas in the Milinda 
we have a defence of Buddhism as against the outsider. The Katha 
Vatthu takes almost the whole of the conclusions reached in the 


5 The character jīng 經 does not purely mean sutra but generically means ‘scripture’ 
(or originally, ‘classic’, including the many non-Buddhist classics revered in Chinese 
tradition). As such it was also used to translate the Indian word sutra/sutta, but 
it does not necessarily point to this as the underlying term. Instead, it was often added 
by Chinese translators whether or not it had a counterpart in the title of the underlying 
Indic-language text. The Pali commentaries' evidence might tip the scales 
in favour of the assumption that jīng 經 here actually means sutra/sutta and, indeed, 
*Nagasenabhiksusutra has been a widespread reconstruction for the original Indian ti- 
tle of the text (e.g. Nanjio 1883, 304; Thich 1964; Guang 2007; 2008; 2009). For the sake 
of the present study, I will adopt this last rendering as a scholarly convention, bearing 
in mind the complexity behind this issue. 


6 Similarly, Levman (2021, 113, 125) suggests that the Milindapañha was a sort of Buddhist 
'catechism', implying with this term that the text was orally transmitted (from 
the Greek meaning of katechízein ‘to instruct orally’), a fact that, in my opinion, we 
should be cautious to endorse. 
Milinda for granted, and goes on to discuss further questions on 
points of detail. 


It is clear that the Milindapañha deals with topics that are more basic 
than the ones treated in the Kathavatthu. Moreover, we also know 
that the Milindapañha refers silently to many Buddhist texts (Rhys 
Davids 1890, XXVII-XXXI), and more directly to others (Rhys Davids 
1890, XXXI-XXXVI).7 Therefore, on the one hand it seems that the 
Milindapañha was composed by people who were erudite in the Buddhist 
doctrine;8 on the other hand, the topics treated are not sophisticated 
disputes on minor issues (the kind of things that would interest 
scholar monks), but concepts that are at the very core of Buddhism 
(e.g. anatman, rebirth, karman, Buddha, nirvana etc.) and that would 
interest a person who knows a little about Buddhism.9 This is also reflected 
by the history of the text in China. According to the conclusion 
reached by Guang Xing (2009), the text was translated into Chinese at 
a very early stage and so it seems to be among the first Buddhist texts 
to arrive in China.10 This is informative about the nature of the text, 
which, evidently, was able to satisfy the expectations and needs of a 
Chinese audience. In this regard, Kogen Mizuno (1982, 46) wrote that 


Although many Chinese were curious about Buddhism and were 
interested enough in the sutras to want to study them, they could 
not really comprehend the alien Buddhist doctrines or philosophy; 
thus they read primarily the general moral teachings and stories 
that neither contain technical terms nor expound doctrine. Those 
simple teachings and stories, presented in ordinary language, were 
comprehensible, interesting, and useful. 


Therefore, it seems that the Milindapañha’s fate was governed by the 
fact of being both a simple text and a text that can appeal to a huge audience. 
This may account for the past and present good fortune of this oeuvre, 
a text that was also adopted by the Theravada Buddhist school. This 


7 See also the updates provided by Horner (1969, XI). 


8 Horner writes that the Milindapañha "has a wide range and covers much ground, 
denoting deep erudition on the part of its compiler" (1969, IX). 


9 This, according to Thich (1964, 32), seems to be especially true for the first sections 
of the Milindapañha (following the Pali version's division into seven sections), whereas 
the last three sections are more sophisticated than the previous ones. 


10 Notably, the Gandhara region - from which the Milindapañha is supposed to have 
originated - was a pivotal area for the transmission of Buddhism in central Asia and China. 
In this regard, see Neelis 2011, 42-7, 229-56. As reported by Richard Salomon (2018, 
26) "[r]ecently a few small fragments have been discovered of a Gandhari text that 
has some resemblance to the Questions of Milinda, including a reference to Nagasena, 
but they seem to belong to some related tradition rather than to the Questions itself". 




Bhasha e-ISSN 2785-5953 
1,1,2022, 111-132 


Bryan De Notariis 
The Buddhist Text Known in Pali as Milindapañha and in Chinese as Nàxiān bǐqiū jīng 那先比丘經 


tradition preserved a Pali version, which, however, is longer than the 
Chinese ones. The latter show an earlier stage of development of the 
work and cover only Mil, 1-89, leaving the remaining part (Mil, 90-420) 
without any other parallel. The Pali version is not only longer, but is 
also ‘Theravadised’ (may the reader forgive my neologism),11 although 
it maintains some odd passages which clearly refer to the doctrines 
of other Buddhist traditions, such as the Sarvastivada school.12 
For some reason, the Pali version has enjoyed more popularity in the West 
than the Chinese versions (Guang 2008, 237), and so it would make 
sense, not only now but also in the future, to further investigate the 
Chinese versions in order to shed new light on such an amazing, and 
to some extent unique, piece of literature. 


11 A good example to demonstrate that the Theravada school modified the text is the 
presence of the concept of bhavanga in the Milindapañha (Mil, 299-300). The term is, 
indeed, peculiar to the Theravada tradition and is found primarily in the Pali texts. As 
stated by Kim (2018, 754), Vasubandhu also wrote within his *Karmasiddhiprakarana 
(Dàshèng chéngyè lùn 大乘成業論; T 1609) that bhavanga (yóufen shi 有分識) originated 
among the Tamraparniya(-nikaya) (= chitóngyé 赤銅鍱) (the full passage reports: 赤銅鍱部經中建立有分識名; 
T1609.31.0785a14). Kim also writes that "Tamraparniya refers to, or 
is at least closely related to Sri Lankan Theravada tradition" (2018, 754). It could be of 
some interest here to highlight that the manuscripts used by Trenckner for his edition of 
the Milindapañha were mostly copied in Sri Lanka (Trenckner 1880, III-VII). Another element 
of the Theravada within the Milindapañha is the interpretation of the term kappa 
'aeon' in connection with the possibility for the Buddha to extend his life through 
the iddhipada 'the foundation of psychic power'. In the canonical literature it is written 
that "anyone who has cultivated the four iddhipadas, who has practiced them assidu- 
ously, mastered them, made them as a base, established them, become acquainted with 
them, properly undertaken them, he can last, as he wishes, for a kappa or what remains 
of a kappa" (yassa kassaci cattaro iddhi-pada bhavita bahulikata yanikata vatthu-kata 
anutthita paricita susamaraddha, so akankhamano kappam va tittheyya kappavasesam 
va; D, II, 103). In this passage, the term kappa is interpreted, according to the Pali commentarial 
literature, as the ayu-kappa, i.e. the ‘life-span’ (ettha ca kappan ti ayu-kappam; 
Sv, II, 554), whereas the interpretation of the term kappa as indicating a maha-kappa 
'cosmic aeon' would seem the right one (Gethin 2001, 94-7). The fact that the Milindapañha 
(see Mil, 141), in the same context, also supports the reading ayu-kappa may be 
an indicator of the Theravada's influence. Finally, as highlighted by Thich (1964, 23), it 
is worth noting that the Pali version mentions the names of Theravada Abhidhamma. 


12 There is, indeed, a mention of the existence of three times in the Milindapañha (see 
Mil, 49-50), a clear sign of the Sarvastivada’s influence (see also Guang 2008, 239). Another 
of Sarvastivada’s characteristics is found in Mil, 268-71, in which nibbana (Sanskrit: 
nirvana, the well-known ultimate goal of Buddhists) and akasa (= Sanskrit: akasa 
‘space’) are described as akammaja ‘not born of kamma’, ahetuja ‘not born of cause’, anutuja 
‘not born of physical change’. This recalls the Sarvastivada’s tenet according 
to which only akasa and two kinds of nirvanas are considered asamskrta 'unconstructed', 
whereas for the Theravada tradition the nibbana only is considered as such 
(see Lamotte 1988, 609-10; Horner 1969, XVIII). Finally, another point that differs from 
the orthodox Theravada tradition is the eight investigations (attha mahavilokanani) at 
Mil, 193, because these investigations are only five in the Pali commentaries (Horner 
1969, XVI-XVII). 
2 Western Reception of the Chinese Versions 
and the Problem of the Archetype 


A comparative study of the P[ali] Milindapanha and the C[hinese] 
Na-hsien-pi-ch'iu ching shows clearly that both versions derive 
from the same source as they have many points in common be- 
tween them [...] the trend of the dialogues is almost identical, the 
dialogues veer round the same theme, with unimportant divergenc- 
es scattered unevenly. (Thich 1964, 1) 


[A] detailed comparison of the Chinese and Pali texts does not sup- 
port translation from a common text, as the vast majority of the 
translations are quite different, being not literal but paraphrases; 
the overall content is generally held in common, but the details of 
the similes are often quite different. (Levman 2021, 113) 


As might be noted from these quotations, scholars can have different 
inclinations concerning the earlier sources underlying the extant versions 
of our text, whether a common ancestral source or even the 
Urtext. The resulting judgment might seem prima facie to depend on 
giving pre-eminence either to similarities or to differences. However, 
there is little doubt that the versions have something in common. This 
section does not aim to establish a definitive answer to the conundrum of 
the existence of an archetype, but is of a more modest scope. Following 
the introduction of some historical data on the Chinese versions of 
the text, their Western reception is analysed, highlighting how, since 
the very beginning, some scholars showed a certain anxiety in establishing 
which one among the versions is closer to the original. This 
quest prompted the establishment of a methodological approach, exemplified 
by some applicable guidelines suggested by Gérard Fussman. 
A corollary to a guideline will be proposed and, to prove its usefulness, 
an eristic dialogue shared by all versions will be analysed.13 


2.1 Dating the Chinese Versions of the Text 


The text in its Chinese versions is called Nàxiān bǐqiū jīng (那先比丘經; 
T 1670) - which would correspond to the reconstructed Sanskrit 
form *Nagasenabhiksusutra - and the compilers of Chinese catalogues 
of Buddhist scriptures ascribed it to the Eastern Jin dynasty 


13 I use the term 'eristic' to describe the dialogue that will be analysed because the 
disputers will exploit the ambiguity of words to win in the debate, rather than using 
logic to approach a more objective truth. 
(東晉; 317-420 CE). As this date was established in retrospect by 
later catalogues, it would make sense to also consider the Japanese 
scholarship, according to which the text was translated into Chinese 
no later than the third century CE, probably around the second century.15 
The text has been handed down to us in two versions, A and B, 
whereas another lost version was translated into Chinese around the 
third century (Demiéville 1924, 8, 21; Fussman 1993, 67). In the West- 
ern academic environment, the text established itself gradually, part- 
ly obscured by the success of its counterpart in Pali. 


2.2 Western Reception 


In his A Catalogue of the Chinese Translation of the Buddhist Tripitaka, 
Bunyu Nanjio (1883) cautiously wrote that the *Nagasenabhiksusutra 
“seems to be a translation of a text similar to the Milinda-panho, 
though the introductory part is not exactly the same as that of the 
Pali text” (304). Nanjio was cautious in his statement since he com- 


14 See Demiéville 1924, 9, 21; Fussman 1993, 66-7; Guang 2009, 227. It is not entirely 
clear to me why von Hinüber (1996, 83) dated the translation of the text to 
the fourth century despite relying upon Demiéville, who reported the Eastern Jin dynasty 
(東晉; 317-420 CE) as the period of its translation, namely the fourth and, potentially, 
fifth centuries (Demiéville 1924, 21). Norman, who also was relying on Demiéville, 
wrote that "[t]here is also a Chinese version, which can be dated to a time earlier than 
the fourth century A.D." (1983, 111). It seems likely that Norman was considering the 
existence of the lost version. 


15 “Modern Japanese scholar Mizuno Kogen argues convincingly that the Nàgasena 
Bhiksu Sütra was translated into Chinese in Latter Han dynasty (25-220 CE), not lat- 
er than San Guo Dynasty [Three Kingdoms Period] (220-280 CE), even conservatively" 
(Guang 2009, 236). Guang bases his statement on Mizuno 1959, 29-33. In an English 
work of Mizuno it is written that he dates the Chinese translation around 200 CE (Mi- 
zuno 1982, 196), whereas de Jong (1996, 383) reports that Mizuno dates the transla- 
tion around the second or third century CE. Mori (1997-98, 292 fn. 3) and Thich (1964, 
104-5) also follow Mizuno's study. Given that the article by Mizuno is written in Japa- 
nese, I asked a Japanese colleague, Dr. Kenji Takahashi, to check the relevant pages for 
me and he kindly sent me the following Japanese quotation with its translation: "To give 
a conclusion first, considering the various points that I will describe in what follows, I 
[argue] that the translation of this text/sütra is much older than Eastern Jin (K$) pe- 
riod and should be placed during the period of the Later Han (1%) and that at the lat- 
est it is not later than the Three Kingdoms (=|) period" (ssl zx AE, rici S 5 
TRE A: OD BUC. EOS OREI HORE bAT, KERTET, 
BEC Lb SME FEO CIRVEVI DE CHS, ; Mizuno 1959, 30). It goes without 
saying that a thorough examination of Mizuno’s findings and Japanese scholarship in 
general would be of great benefit for future studies. At the moment, many scholars (in- 
cluding myself) can only rely on second-hand reports. The reasons adduced by Mizuno 
to predate the text seem to be stylistic in nature as according to Guang (2009, 236-43), 
Mizuno provides three reasons to support his argument: 1) the terminology used in the 
*Nagasenabhiksusutra is comparable with the translation made by An Shigao 安世高; 
2) some proper terms and pronouns are quite archaic and were often used during the 
Han dynasty; 3) the gathas were translated into prose. 
pared the Chinese rendition with the translation provided by Vilhelm 
Trenckner in his Pali Miscellany (1879), which reported a specimen of 
the Pali version of the text. The Pali text was edited in full by Trenckner 
only in 1880 and the first complete English translation was made 
by Thomas W. Rhys Davids only in 1890. For a full recognition of the 
Chinese versions, we had to wait until 1893, when Edouard Specht 
and Sylvain Lévi clearly identified the two Chinese translations as a 
parallel text to the Pali Milindapañha or, to be more precise, as different 
recensions of a text of which the Pali version represents only one 
recension (Specht, Lévi 1893).16 The discovery of the Chinese translations 
of the Milindapañha influenced the question about the existence 
of an original text. Trenckner already believed that the Pali version 
was a translation from another text: 


It [i.e. Milindapañha] must have been imported from northern India, 
where alone the name of the conqueror [i.e. Milinda] can have 
been preserved. In all probability the original was in Sanskrit, and 
our text is a translation. (Trenckner 1880, VII)17 


However, after the discovery of the Chinese translations, it is possible 
to wonder which one among the Pali and Chinese versions is closer to 
the original work or if it is possible to recover an archetype by comparing 
the recensions.18 Just one year after the publication of Specht and Lévi's 
article, Rhys Davids published the second volume of his translation of the 
Milindapañha, taking into account the existence of the Chinese 
versions. He had the feeling that Specht and Lévi wanted to demon- 
strate that the Chinese versions were older recensions than the one in 
Pali. Conversely, Rhys Davids seems to suggest that the Pali version, 
despite being longer, might represent the closest one to the original 
work. However, more accurate comparisons between the Pali and 


16 "Un simple examen suffit pour constater que nous avons trois rédactions du même 
ouvrage qui a été successivement remanié" (Specht, Lévi 1893, 521). 


17 It is worth noting that the early assumption made by Trenckner that the original 
language of the text was Sanskrit has been replaced by the hypothesis that it was 
Gandhari. In this regard, see Demiéville 1924, 11; Fussman 1993, 66; von Hinüber 1996, 
83; Kubica 2014, 188; 2021, 430; Baums 2018, 33; Salomon 2018, 26; Levman 2021, 110. 


18 Here, it is worth mentioning the recent contribution of Jonathan Silk (2021), who 
presents some theoretical remarks on how to approach Mahayana Buddhist texts. He 
highlights that if our sources lack a 'unique redactorial moment', it would be impossible 
to recover the Urtext simply because it never existed. However, the versions of our text 
share a common core that is synoptically consistent. In many cases, its analysis helps 
us to move close, if not to the archetype, to the best reading based on evidence (see the 
discussion below). This approach would be appropriate even when considering the recent 
contribution of Bryan Levman (2021), who rejects the idea that the Pali and Chinese 
versions were based on a common text but recognises the existence of a common core. 


19 “Both M. Specht and M. Sylvain Lévi seem to think that the two Chinese books 
were translations of older recensions of the work than the one preserved in Pali. This 
Chinese versions (notably Demiéville 1924; Fussman 1993) suggest 
that the Pali version is an enlarged version and so, to some extent, 
the Chinese recensions are closer to the archetypal work, if any.20 This 
does not mean that it is the original work and in this regard it seems 
useful to quote the words of Gérard Fussman, who, in a succinct but 
very informative passage, provides a sketch for interpreting the rela- 
tionship between the different versions of the text: 


[T]out détail commun au texte chinois et au texte pali a chance de 
remonter à la source originale. Tout détail attesté par un seul de 
ces deux textes est suspect d'être une addition, surtout lorsque ce 
détail se trouve dans la version pali que l'on sait avoir été profondément 
remaniée et amplifiée. Il existe néanmoins une possibilité 
théorique que l'un des deux textes, chinois ou pali, ait conservé 
une information disparue de l'autre texte; dans ce cas-là il est du 
devoir de l'historien de prouver que cette information remonte à 
la source originale avant de songer à l'utiliser. Enfin, si une information 
livrée à la fois par le texte chinois et le texte pali doit être 
décodée ou interprétée pour être pleinement utilisable, cette interprétation 
doit convenir aux deux textes à la fois. J'ajouterai aussi 
que cette interprétation doit tenir compte du fait que ces textes 
sont des textes indiens, utilisant une phraséologie et des procédés 
littéraires indiens et s'inspirant nécessairement de la conception 
du monde et de l'imaginaire indiens. (Fussman 1993, 68) 


This passage provides many interesting guidelines which can be sum- 
marised as follows: 
1. Details shared by both versions may derive from the origi- 
nal source. 
2. Details attested in only one version may be additions (especially 
if occurring in the Pali version). 
3. Theoretical possibility that some details survived in only one 
version and disappeared in the other. 


argument does not seem to me, as at present advised, at all certain. It by no means fol- 
lows that a shorter recension, merely because it is shorter, must necessarily be old- 
er than a longer one. It is quite as possible that the longer one gave rise to the short- 
er ones" (Rhys Davids 1894, XII). A similar position is expounded in Rhys Davids 1916, 
632. In this regard, I quite agree with Olga Kubica who wrote that “when Rhys Da- 
vids expressed his opinions concerning Pali literature, his conclusions were very rea- 
sonable, but when Chinese literature entered the discussion, it seems that the desire 
to emphasize the superiority of Pali literature over Chinese prevailed" (2014, 195-6). 


20 "[L]a plus proche de l'original a chance d'être la chinoise, plus anciennement attestée 
que le Milindapañha, et surtout beaucoup moins remaniée que le texte pali y 
compris dans ses parties narratives" (Fussman 1993, 68). A list of reasons according 
to which the Chinese versions should be regarded as an older record than the Pali ver- 
sion is provided by Thich (1964, 24-35) and Guang (2008, 242-3). 
4. Information that occurs in both versions should be carefully 
interpreted and the interpretation should fit both versions. 

5. The Indian context of the text should be taken into account 
during the process of interpretation. 


I would add an additional item to this list, which can be regarded as 
a sort of corollary to the last item, namely that in the case of the Chi- 
nese versions, the Chinese audience should be taken into account. 
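
The first two guidelines lend themselves to a schematic rendering. The following Python sketch is offered only as an illustration and is in no way Fussman's own formulation: the names `Assessment` and `assess_detail` are hypothetical, and the sketch deliberately encodes nothing beyond attestation. Guidelines 3 to 5 and the corollary on the Chinese audience call for historical judgment and are recorded only as comments.

```python
from enum import Enum


class Assessment(Enum):
    """Provisional status of a detail (illustrative labels, not Fussman's terms)."""
    MAY_BE_ORIGINAL = "shared by both versions; may derive from the original source"
    MAY_BE_ADDITION = "attested in the Chinese version only; may be an addition"
    SUSPECT_ADDITION = "attested in the Pali version only; especially suspect"


def assess_detail(in_pali: bool, in_chinese: bool) -> Assessment:
    """Apply guidelines 1 and 2 to a detail according to where it is attested."""
    if not (in_pali or in_chinese):
        raise ValueError("a detail must be attested in at least one version")
    if in_pali and in_chinese:
        # Guideline 1. Guidelines 4 and 5 and the corollary still apply:
        # any interpretation must fit both versions and their contexts.
        return Assessment.MAY_BE_ORIGINAL
    # Guideline 2. Guideline 3 allows that such a detail may nonetheless be
    # original, but this has to be proven before the detail is used.
    return Assessment.SUSPECT_ADDITION if in_pali else Assessment.MAY_BE_ADDITION


if __name__ == "__main__":
    # A passage shared by the Pali and Chinese versions.
    print(assess_detail(in_pali=True, in_chinese=True).value)
    # A passage from the part of the Pali text without a Chinese parallel.
    print(assess_detail(in_pali=True, in_chinese=False).value)
```

The output of such a function is a starting point for the philological argument, not its conclusion.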


2.3 An Explicative Example. A Buddhist Eristic Dialogue 


In this regard, there is an interesting eristic dialogue between Milinda 
and Nagasena that has survived in both recensions. This episode is reported 
in the Pali version as follows: 


The king said: “Venerable Nagasena, is the Buddha one who observes 
celibacy [brahmacarin, lit. ‘one who has the brahma-conduct’]?” 
“Yes, great king, the Buddha was one who observes celibacy!” 
“Then Venerable Nagasena, the Buddha is a pupil of Brahma!” 
“Do you have, great king, a state elephant [hatthipamokkho]?” 
“Yes, Venerable, I have it.” 
“Does this elephant, great king, make the trumpet noise [koñcanada] at times?” 
“Yes, Venerable, it does it.” 
“Then, great king, this elephant is a pupil of herons [koñca]!” 
“It is not so, Venerable!” 
“And is Brahma, great king, intelligent [sabuddhika] or stupid [abuddhika]?” 
“He is intelligent, Venerable!” 
“Then, great king, Brahma is a pupil of the Blessed one [bhagavant, epithet of Buddha]!” 
“You are witty Venerable Nagasena”.21 


21 rājā āha: bhante Nāgasena, Buddho brahmacārī ti. - āma mahārāja, Bhagavā
brahmacārī ti. - tena hi bhante Nāgasena Buddho Brahmuno sisso ti. - atthi pana te
mahārāja hatthipāmokkho ti. - āma bhante, atthī ti. - kin-nu kho mahārāja so hatthī
kadāci karahaci koñcanādaṃ nadatī ti. - āma bhante, nadatī ti. - tena hi mahārāja so
hatthī koñcānaṃ sisso ti. - na hi bhante ti. - kim-pana mahārāja Brahmā sabuddhiko
abuddhiko ti. - sabuddhiko bhante ti. - tena hi mahārāja Brahmā Bhagavato sisso ti. -
kallo si bhante Nāgasena ti (Mil, 75-6).


In this passage, there are many puns and it is here that this way of
playing on words seems to bear the weight of a logical argument,
albeit there is nothing which appears clearly logical.²² The first question
is put by Milinda to Nāgasena, asking if the Buddha is a brahmacārin,
a term that came to mean ‘celibacy’ or ‘chastity’, in the sense of a
total abstention from sexual intercourse.²³ However, since this term is
composed of the words ‘brahma’ and ‘cara’ plus the suffix -in used
to create adjectives, literally it means ‘one who has the
brahma-conduct’. So, according to Milinda, if the Buddha has a brahma-conduct
this means that he can be regarded as a follower of Brahmā (an Indian
god).²⁴ This reasoning is certainly deceptive since it does not take
into account the real meaning of the word, applying an overly literal
interpretation. Therefore, Nāgasena answered the king using
the same reasoning, showing, at first, that it would lead to ridiculous
results: as the elephant trumpets (koñcanāda), this would mean that it
should be regarded as a follower of herons (koñca). Secondly, he shows
that it is possible to demonstrate in the same way that the god Brahmā
is a follower of the Buddha because he is intelligent (sabuddhika).

Now, turning to the Chinese version B, it is possible to note a slightly
different phrasing for the first part concerning the pun on
brahmacārin, which is meaningful. The Chinese recension reports:


The king again had a question for Nāgasena: “Has the Buddha
practised the conduct of Brahmā, who is the king of the seventh
heaven, not having any sexual intercourse with women?”
Nāgasena replied: “[The Buddha] keeps himself completely apart
from women, he is pure, without any flaw or contamination”.²⁵


22 It seems, indeed, that we should look at this dialogue in light of the ancient Indian
way of debating. In this regard, and with particular reference to the debates in the
Milindapañha, see Anālayo 2021a.


23 See Gombrich 2009, 202-3. Besides, as highlighted by Neri and Pontillo (2014, 
160) the meaning of brahmacariya "cannot be limited to a life of chastity, but includes 
a ‘path of life’ and has other important links with the highest achievements of the Bud- 
dha's path". However, the meaning of 'chastity' is certainly relevant to our context. 


24 A discussion on the cult of Brahmā in ancient India is provided by McGovern (2012).
To be thorough, it is worth mentioning that in addition to the interpretation of the
stem brahma/ā as the Indian god Brahmā, another widespread meaning is that of
‘excellent’ or ‘foremost’, as when the Pali commentaries gloss brahma as seṭṭha. For
instance: brahman ti seṭṭhaṃ uttamaṃ visiṭṭhaṃ (Ps, II, 27); brahmabhūto ti seṭṭhasabhāvo
(Mp, V, 72); brahmapattiyā ti seṭṭhapattiyā (Spk, I, 265). Or, more specifically on
brahmacariya: brahmacariyan ti seṭṭhaṭṭhena brahmabhūtaṃ cariyaṃ brahmabhūtānaṃ vā
buddhādīnaṃ cariyan ti vuttaṃ hoti (Sv, I, 179). In some passages, being a brahmacārin
is even equated with the attainment of the arahant state: “A pure brahmacārin is a
monk who has destroyed the noxious influxes (i.e. an arahant)” (suddhaṃ brahmacārin ti
khīṇāsavabhikkhuṃ; Sp, II, 484). According to some canonical passages, the word
brahma- in some compounds can even be synonymous with the word dhamma, as in
brahma-kāya, brahma-bhūta, brahma-yāna (cf. Neri, Pontillo 2014, 170-1).

25 T1670B.32.0716b05-07. Other translations of this passage are provided by Demiéville
(1924, 158), Thich (1964, 87), Guang (2007, 177), Anālayo (2021b, 193).

Here, it seems as though the text is trying to explain the pun to the
reader, by providing the two different meanings of ‘brahma’ (= fàn 梵)
involved.²⁶ At first, it is written that the so-called ‘brahma-conduct’
(fàn suǒxíng 梵所行) refers to the king of the seventh heaven (i.e.
Brahmā)²⁷ and, secondarily, it is specified that the term refers to
total abstention from sexual intercourse (jiāohuì 交會) with women
(fùnǚ 婦女).²⁸ Theoretically, we may wonder whether the Chinese version
added new material or the Pali version removed these parts. In this
regard, it is useful to remember both Fussman's suggestion, namely, to
take into account the Indian context, and my suggestion to take into
account the Chinese audience. Considering these presuppositions, we
should admit that the word brahmacārin would not require any explanation
in India since it has formed part of Indian culture for many years,
given that it also occurs within the oldest Indian text recorded, i.e.
the Ṛgveda (10.109.05). In the same way, we can assume that a native
Chinese speaker would have some difficulty in grasping the meaning
of the term, let alone the pun behind the passage. Therefore, the
hypothesis that the Chinese version enlarged the text in order to better
convey the pun might be more plausible than supposing that the original
work had these kinds of specifications.²⁹ The fact that the Chinese
version B modified the text to satisfy the Chinese audience is also
evident from another part of the same account. In this context, it is also
useful to involve version A of the Chinese translation. The point
at issue is the pun based on the trumpet of the elephant (koñcanāda),
which would lead to the (il-)logical result that the elephant is a pupil


26 The interpretation that the Chinese text gives to the stem brahma/ā is one among
its many polysemous uses; for further details see the seminal article by Neri and
Pontillo (2014).


27 It is possible to compare the reconstruction of the Buddhist cosmology made by
Gethin (1997, 195; 1998, 117-8) and De Notariis (2019, 66-7) to verify that the seventh
world above the human realm is actually called Brahmā's retinue (brahmapārisajja);
see the Appendix.


28 The term suǒxíng 所行 can also be translated as ‘practice’ as in the expression shífǎ
suǒxíng 十法所行 (T0280.10.0445a28) that Jan Nattier (2007, 113) translates as ‘ten
practices’. This interpretation of the term would expand its scope beyond mere celibacy and
would be in line with views of brahmacariya and brahmacārin as expounded in the Pali
canonical and commentarial literature (see fn. 24 above). This may shed some light
upon the need that the Chinese redactors had to clarify the locution fàn suǒxíng 梵所行.


29 Stefano Zacchetti highlighted that often in the process of translating, the text's
interpretation or, let us say, its exegesis, was actively involved and put into the final
translated version of the text. In this regard, he writes that: “Forse le prime traduzioni
cinesi non erano lontane da questa situazione: in altre parole, l'elemento originariamente
traducibile sarebbe stato non tanto il sūtra, quanto la sua esegesi orale”
(Zacchetti 1996, 357-8; Author's transl.: “Perhaps the earliest Chinese translations were
not far from this situation: in other words, the originally translatable element would
have been not so much the sūtra, as its oral exegesis”).

of herons (koñca). In [tab. 1] below, it is possible to compare the Pali
version with the Chinese versions A and B.


Table 1 Comparison between the Indian and Chinese versions on the elephant/bird song

Pali version
  Translation: “Does this elephant, great king, make the trumpet noise [koñcanāda]
  at times?” “Yes, Venerable, it does.” “Then, great king, this elephant is a
  pupil of herons [koñca]!” “It is not so, Venerable!”
  Text: kin-nu kho mahārāja so hatthī kadāci karahaci koñcanādaṃ nadatī ti. - āma
  bhante, nadatī ti. - tena hi mahārāja so hatthī koñcānaṃ sisso ti. - na hi bhante
  ti. (Mil, 76)

Chinese version A
  Translation: Nāgasena asked the king: “What is elephant [xiàng 象] song
  [míngshēng 鳴聲] like?” The king replied: “The elephant song is like the singing
  [shēng 聲] of a wild goose [yàn 鴈]”. Nāgasena said: “If so, the elephant
  [xiàng 象] is the pupil of the wild goose [yàn 鴈], but each one of them is a
  different species.”
  Text: T1670A.32.0700c20-22

Chinese version B
  Translation: Nāgasena asked the king: “What is bird [niǎo 鳥] song [niǎoshēng]
  like?” The king replied: “The bird song is like the singing [shēng 聲] of a wild
  goose [yàn 鴈]”. Nāgasena said: “If so, the bird [niǎo 鳥] is the pupil of the
  wild goose [yàn 鴈], but each one of them is a different species.”
  Text: T1670B.32.0716b11-13


Concerning this passage, Demiéville believed that the Chinese 
translator(s) did not understand the pun. Therefore, he wrote that: 


Ce passage est corrompu; les copistes chinois ne pouvaient comprendre
les jeux de mots (buddha et buddhi, koñcanāda «barrissement»
et «cri du héron»), par lesquels Nāgasena réplique à la boutade
étymologique du roi. (Demiéville 1924, 158 fn. 5)


In my opinion, it is the other way around. The Chinese translator(s)
modified the passage precisely because they understood the pun and so
tried to render it in the best way for their audience. It is worth noting
that the first animal involved in the two Chinese versions is different.
In version A an elephant (xiàng 象) occurs, whereas version
B replaces it with a bird (niǎo 鳥). In the Pali version, we find an
elephant, just as in Chinese version A. However, the second animal (i.e.
the wild goose, yàn 鴈) is the same in both versions, and so is the
term used to designate the animal's call (míngshēng 鳴聲). We can note
from the Pali version that the pun is due to the similarity between the
trumpet koñcanāda (which literally means the sound nāda of the heron
koñca), and the herons koñca. A kind of similarity is also involved
in the Chinese phrasing, although it is not a phonetic similarity as in
the Pali version but an ideographic one. It is possible, indeed, to note
that the combination of characters used to denote the elephant's trumpet
is míngshēng 鳴聲 and the second animal involved is yàn 鴈, a wild goose.
The characters míng 鳴 and yàn 鴈 have something in common, namely they
share the same radical: niǎo 鳥 (which is, incidentally, the first animal
involved in version B in place of the elephant of version A). This kind of
rendition, ideographic rather than phonetic, is certainly more suitable
to a Chinese audience. It is also possible that the
use of míngshēng 鳴聲 to indicate the elephant's trumpet in version A
was a little forced. This can be inferred from the existence of xiàngshēng
象聲, probably a more appropriate term to designate the elephant's
trumpet.³⁰ A search in The SAT Taishō Shinshū Daizōkyō Text Database
shows that there are sixty-four occurrences of xiàngshēng 象聲,
and only one more occurrence of xiàng míngshēng 象鳴聲, in addition
to the occurrences in T 1670 A. It would seem that the choice of
míngshēng 鳴聲 to designate the trumpet of the elephant (xiàng 象)
is quite peculiar and might have sounded a bit odd. Perhaps this was
the reason why the Chinese version B replaced the elephant (xiàng
象) with a bird (niǎo 鳥). It is indeed possible that for the other
Chinese translator(s) the combination of characters in míngshēng 鳴聲
recalled something like a twitter or a chirp rather than a trumpet.³¹

Naturally, we cannot definitively exclude the possibility that the
character niǎo 鳥 for xiàng 象 is the result of a hypercorrection by a
scribe who thought that xiàng 象 must be a mistake, or even (given
the overall similarity in shape of the two characters) a simple copying
error. So, in this case we may wonder whether it is better to assume
a dull ancient scribe or a skilful one. Similarly, considering in general
the translation of the entire dialogue, we may wonder whether it is
better to assume a dull translator - who either did not understand
the pun or was not able to render it into Chinese - or a knowledgeable
one who skilfully adapted the text to the target audience. Assuming,
for the sake of argument, the latter case, we can read the remaining
part³² of this eristic dialogue in a new light.


30 Xiàngshēng 象聲 would correspond to the Sanskrit nāgasvara or nāgaśabda, see
Hirakawa 1997, 1105.


31 This fact may support the hypothesis that sees version B as a more revised version
when compared with version A, as maintained by Guang (2009). However, as a
reviewer of this paper highlighted, the picture outlined by Mizuno (1959) - who has been
one of Guang's main sources - may be much more complex, and so future studies to
understand and include his findings will be needed.


32 This section occurs in the middle of the account in the Chinese versions and as
the last part of the Pali version.

Table 2 Comparison between the Indian and Chinese versions of the section
including the pun based on sabuddhika/abuddhika and yǒu niàn 有念/wú niàn 無念

Pali version
  Translation: “And is Brahmā, great king, intelligent [sabuddhika] or stupid
  [abuddhika]?” “He is intelligent, Venerable!” “Then, great king, Brahmā is a
  pupil of the Blessed one [bhagavant, epithet of Buddha]!” “You are witty,
  Venerable Nāgasena.”
  Text: kim-pana mahārāja Brahmā sabuddhiko abuddhiko ti. - sabuddhiko bhante
  ti. - tena hi mahārāja Brahmā Bhagavato sisso ti. - kallo si bhante Nāgasena ti
  (Mil, 76)

Chinese versions A and B
  Translation: Nāgasena asked the king: “Is the king of the seventh heaven
  intelligent [yǒu niàn 有念] or stupid [wú niàn 無念]?” The king replied: “Brahmā,
  the king of the seventh heaven, is intelligent!” Nāgasena said: “For this
  reason, Brahmā, the king of the seventh heaven, as well as all the high gods,
  should be considered a disciple of the Buddha [fó 佛]”
  Text: T1670A.32.0700c18-20 = T1670B.32.0716b08-11


On the surface, we may wonder why the translator(s) did not make
any change to explain the connection of sabuddhika and abuddhika,
translated respectively into Chinese as yǒu niàn 有念 and wú niàn
無念, to the word Buddha, in Chinese fó 佛. The term niàn 念 has always
been interpreted as translating the word buddhi, probably in view of
the evidence from the Pali version and, possibly, underpinned
by the fact that niàn 念, as much as buddhi, broadly relates to the mental
dimension (as the radical xīn 心 of niàn 念 would suggest). The act
of providing a modern translation of the Chinese passage, in addition
to the convention of following the Pali version, obscures the fact that
to the Chinese reader the passage, as it is written, might already convey
the pun. If we step for a moment into the Chinese readers' shoes,
could we really assume that the first concept that would arise in their
mind when reading niàn 念 is something like the Indic term buddhi?
Rather, arguably, other concepts involving the word niàn 念 were
more popular and probably the foremost was niàn 念 understood as
one of the practices of ‘recollection’, the first of which is traditionally
the ‘Recollection of the Buddha’ (see, for instance, Vism, 197) that in
Chinese is niàn fó 念佛. In this case, niàn 念 translates the Pali word
anussati (= Sanskrit: anusmṛti) which indicates the systematic exercise
of recollecting or calling to the mind³³ something that, in the case


33 “Because it is a mindfulness (sati) that arises again and again, it is [called]
recollection (anussati)” (punappunaṃ uppajjanato sati yeva anussati; Vism, 197).
Interestingly enough, according to Rupert Gethin (2001, 37) “[t]he Milindapañha contains what is
perhaps the earliest attempt in Buddhist literature to state fully just what sati is.
Questioned by king Milinda as to the characteristic (lakkhaṇa) of sati, the monk Nāgasena

of the Buddhānussati, concerns the Buddha and his qualities.³⁴ This
technique is common to many Buddhist traditions and, especially
along the Silk Road, developed in popular forms that reached as far as
East Asia.³⁵ Even our Chinese version B attests to the existence of
that practice:


You ascetics say: “People who during their life practise evil for one
hundred years [can], once the time of death has approached, recollect
the Buddha [niàn fó 念佛] [and so] all of them will after death be
born in high heavens”.³⁶


And also: 


Although a person has been evil in the past, having recollected the
Buddha [niàn fó 念佛] [even] one time, he will therefore not enter
into the hells but gains rebirth in high heavens.³⁷


This evidence tells us that at the time of the Chinese translation of the
text the practice of niàn fó 念佛 was already in existence in the
cultural milieu in which the text circulated and, so, was likely well known
to the translator(s). Therefore, it is not implausible to think that the
translator(s) of the text adopted the character niàn 念 since it implicitly
recalls the idea of the Buddha thanks to the widespread practice
of niàn fó 念佛. The Digital Dictionary of Buddhism (DDB) reports
the possibility that the character niàn 念 can stand alone with the
implied meaning of niàn fó 念佛, being a sort of abbreviation of it.³⁸ This
fact might suggest to us that for one who is acquainted with the practice
of the recollection of the Buddha, the simple reference to niàn 念
can somehow recall the whole locution niàn fó 念佛 which includes the
term fó 佛 (= Buddha), given that this is the first among the recollections
and one that had great success in the religious market along the

replies that it has both the characteristic of calling to mind (apilāpana) and the
characteristic of taking hold (upagaṇhana)”. Here, Gethin is referring to Mil, 37-8.

34 “The recollection of the Buddha is the recollection that arises with reference to
the Buddha” (Buddhaṃ ārabbha uppannā anussati Buddhānussati; Vism, 197). Essentially,
the practitioner has to recollect the qualities of the Buddha as expressed in the
famous iti pi so formula (Vism, 198-213).


35 In Japan, for instance, this practice is known as nenbutsu (念仏). A good overview
of the Buddhānussati/Buddhānusmṛti is provided by Harrison 1992.

36 T1670B.32.0717b12-13.

37 T1670B.32.0717b18-19.

38 See DDB, s.v. “念” (http://buddhism-dict.net/cgi-bin/xpr-ddb.pl?q=%E5%BF%B5),
which, incidentally, reports a quotation from Frédéric Girard:
“Abréviation de nianfo 念佛, acte d’attention de la pensée”.




Bhasha e-ISSN 2785-5953 
1,1,2022, 111-132 


Bryan De Notariis 
The Buddhist Text Known in Pali as Milindapañha and in Chinese as Nàxiān bǐqiū jīng 那先比丘經


Silk Road. Thus, we can hypothesise that the Pali wordplay of
sabuddhika/abuddhika with Buddha is mirrored in the Chinese text by yǒu
niàn 有念/wú niàn 無念 with fó 佛, assuming a stretched and creative
interpretation of niàn 念 as implicitly paired with fó 佛. Endorsing
this understanding means assuming the existence of a skilful translator,
who played with the Chinese characters as much as the Indian
creator(s) of the text did with the Indic words.


3 Conclusion 


After some consideration regarding the possible audience and
author(s) of the text known to us as Milindapañha in Pali and Nàxiān
bǐqiū jīng (那先比丘經) in Chinese, the relationship between the extant
versions has been analysed. Beginning with some guidelines to
compare the Pali and Chinese versions provided by Gérard Fussman,
a further guideline was suggested, namely, to take into account that
the Chinese versions were written for a Chinese audience. In order to
corroborate this point, a passage which involves a pun was analysed,
showing that the Chinese translator(s) of the text adapted the translation
in order to satisfy the target audience. This fact can of course
have some important implications for any attempt to reconstruct the
archetype, whose very existence could be questioned on the basis of
some recent publications.³⁹ However, the question undoubtedly
deserves further inquiry. What is clear from the present study is that
we can still scrutinise the extant versions, comparing similar accounts
and reasoning on the differences attested. This effort is not worthless,
and the lack of certainty about the existence of the archetype has not
negatively affected the knowledge gained by working philologically
as if there were one. The findings also have some implications for our
comprehension of the translators' strategies in adapting foreign Indian
Buddhist literature to the Chinese milieu. In the example taken
into account, the Indian word brahmacārin conveys the ambiguity on
which the pun is based since it means ‘celibacy’ but literally is
‘brahma-conduct’, and Brahmā is also a preeminent Indian god. However,
we cannot expect that a non-Indian audience would easily grasp
the jeu de mots and, indeed, a Chinese version of the passage specifies
that the term is referring to the ‘king of the seventh heaven’ (i.e.
Brahmā) and to the ‘abstention from sexual intercourse’ (i.e. celibacy).
In this regard, during a potential attempt to reconstruct the archetype,
we should assume that the Pali version conveyed a more reliable
reading since the specifications provided by the Chinese version
are only necessary for a Chinese audience. It is, indeed, part of the


very nature of puns to be understood with ease and immediacy,
otherwise not only would the humorous intent not be grasped, but also
the general meaning of the passage would remain obscure. Therefore,
there is little doubt that a pun in a text from Northwest India
was intelligible to an Indian audience and not to a Chinese one. The
differences in the exposition concerning the trumpet of the elephant
(koñcanāda; míngshēng 鳴聲) that would make the elephant (hatthin;
xiàng 象) or the bird (niǎo 鳥) a follower of the herons (koñca) or wild
goose (yàn 鴈) should be interpreted in a similar way. On this occasion
too, the Chinese translator(s) adapted the text in order to render
the pun in the best way, using the similarity of the radicals of the
characters (radical niǎo 鳥 in míng 鳴 of míngshēng 鳴聲 and in yàn 鴈).
The odd choice of míngshēng 鳴聲 to designate the elephant's trumpet
may have also influenced the substitution of the elephant (xiàng
象) with the bird (niǎo 鳥) in version B, assuming that míngshēng 鳴聲
would better convey the meaning of a twitter or a chirp than a trumpet.
Finally, it has been suggested that the Indic pun
based on sabuddhika/abuddhika as recalling the word Buddha (thanks
to the assonant term buddhi) has been aptly rendered into Chinese in
a way that preserved the mechanics of the wordplay. The term niàn
念 used to translate buddhi can similarly recall the Buddha (in Chinese
fó 佛), due to the widespread practice of the ‘Recollection of the
Buddha’ called niàn fó 念佛. Surely, it would seem hard to demonstrate
beyond doubt that this is the only possible interpretation since
we cannot look into the mind of the ancient translator(s), but this
hypothesis prompts us to ask at least one question: should we let the
Pali version level out our reading of the Chinese text? The analysis of
the Buddhist eristic dialogue proposed in the present study introduces
us to a new, different scenario, one in which the ancient Chinese
translator(s) did not play the role of a dull translator but acted
skilfully and creatively in presenting sophisticated foreign puns to
their own audience. All in all, is it not creativity that we find at the
very core of any pun? Gérard Fussman is, therefore, certainly right
in highlighting the need to take into account the Indian origin of the
text. As a logical corollary, we should also pay special attention to the
Chinese adaptation and its cultural circumstances.

39 Here, I refer to Levman 2021 and Silk 2021.

Appendix


This extremely simplified scheme is based on the reconstruction of
the Buddhist cosmology made by Gethin (1997, 195; 1998, 117-8) and
adopted also by De Notariis (2019, 66-7). It is not supposed to be
comprehensive, but aims only to highlight that, in the seventh realm
counting up from that of the humans, the deities begin to be called brahmā.


REALM (bhūmi)                    COSMOLOGICAL SPHERE

   mahābrahmā                    World of Pure Form (rūpadhātu)
   brahma-purohita
7  brahma-pārisajja

6  paranimmita-vasavattin        World of the Five Senses (kāmadhātu)
5  nimmāna-ratin
4  tusita
3  yāma
2  tāvatiṃsa
1  cātummahārājika
0  Human Being (manussa)


This scheme is reflected in the Chinese version B
(T1670B.32.0705a18-19), in which there is evidence that the first
heaven is that of the four great kings (Pali: cātummahārājika = Chinese:
dì yī sì tiān wáng 第一四天王 ‘first [heaven] of the four heavenly
kings’) and the second heaven corresponds to that of the thirty-three
[gods] (Pali: tāvatiṃsa = Chinese: dì èr dāo lì tiān 第二忉利天 ‘second
[heaven] of the thirty-three heavenly [gods]’). In this regard, see the
translation of Demiéville (1924, 89).


1 Introduction


The expression of possession has been debated for many decades: 
since the 1980s, several monographs have been published, both typo- 
logical (Seiler 1983; Heine 1997; Stassen 2009), and language-spe- 
cific (see for example, Lehmann 2002 for Yucatec Maya; Taylor 1996 
for English; Mazzitelli 2015 for Belarusian and Lithuanian). Despite 
the numerous contributions that have appeared in recent years, the 
study of possessive constructions continues to present significant an- 
alytical challenges. 

On the one hand, possession is a fundamental domain of human
experience: possessive constructions can be found in all the languages
studied thus far and every human being can conceive - even if only
intuitively - the difference between ‘what belongs to me’ and ‘what belongs
to someone else’ (Heine 1997). On the other hand, it is very difficult
to define the semantic and pragmatic parameters that lead scholars to
group under the same label constructions that, from a purely syntactic
point of view, have nothing in common. Indeed, as many typological
studies on possession clearly show (Heine 1997; Stassen 2009),
cross-linguistic variation reveals a multitude of syntactic configurations
expressing the notion of possession. Some languages (mainly SAE
languages) use transitive constructions with ‘have’-verbs to encode
possessive notions; other languages use intransitive constructions. Many
Indo-Aryan languages, for example, lack a verb equivalent to English
‘have’. They use intransitive constructions with the Possessor marked
in the oblique case and the Possessee in the nominative. This happens
for example in Punjabi (Shackle 1972), in Bengali (Thompson 2010) and
in Marathi (Dhongde, Wali 2009). The same happens in Hindi, where
possession is encoded mainly through genitive or locative existential
constructions. To better illustrate this point, let us consider the
difference between the three sentences below (1)-(3):


(1) tya-la tin sadr-e ahe-t 
he-DAT three shirt-M.PL be-3PL 
‘He has three shirts’. (MARATHI) 


* The example (including transliteration and glossing) is taken from 
Dhongde, Wali 2009, 197. 


A special thanks to Silvia Luraghi for her precious insights and suggestions. This
paper derives from my MA thesis at the University of Rome “La Sapienza” under the
supervision of Maria Carmela Benvenuto, Flavia Pompeo and Giorgio Milanetti: my
sincere thanks to them too, for their valuable guidance through my years of study. I am
also grateful to Andrea Drocco for the long discussions we had and for his many
insightful remarks and to the two anonymous reviewers for their comments. Any
mistakes are my responsibility entirely.

(2) uske pās          tīn    kamīzeṁ          haiṁ
    3SG.LOC(beside)   three  shirt.F.PL.NOM   to be.3PL.PRS
    ‘He has three shirts’. (HINDI)


(3) lui       ha                tre    magliette
    3SG.NOM   to have.3SG.PRS   three  shirt.F.PL
    ‘He has three shirts’. (ITALIAN)


Constructions (1) and (2) are intransitive predications in which the 
first participant is encoded in an oblique case, while the second par- 
ticipant is the syntactic subject; the verb has an existential meaning. 
Note that in this case there is no lexicalisation of the possessive mean- 
ing: the semantics of the construction arises from the global struc- 
ture of the sentence. The third construction is a transitive predication 
where the meaning of possession is lexicalised in the verb (ha ‘has’), 
the first participant is encoded as a subject, and the second partici- 
pant is the direct object of the verb. Despite the syntactic differenc- 
es, these three constructions encode the same possessive meaning. 

Further, important problems also arise from a semasiological point 
of view: what is intuitively identified as a possessive construction can 
frequently be used to express different semantic notions. To illustrate 
this point, let us consider some possessive predicates from English, a 
language with a single possessive verb have which covers much more 
than just the semantics of ownership. 


(4) My doctor has a Volkswagen.
(5) Maria has a twin sister.
(6) Everyone has the right to speak.
(7) I have no idea.
(8) That woman has a lot of courage.


As we can see, the single English have-construction can express a
range of semantic possibilities that goes from material alienable
possession (see sentence (4), where the notion expressed is that of
ownership) to abstract possession (sentence (6), where the entity
possessed is immaterial) and, through a metaphorical extension, also
covers some sub-domains of the experiential domain (like cognition
in sentence (7), where the Possessee is a cognitive status). These
extensive semantic uses of the verb have are not exclusive to the
English language; rather they are quite common (Heine 1997). To further
complicate matters, the opposite situation can also occur: a language
can display two or more constructions for the encoding of the possessive
domain. In consideration of all these factors, a preliminary distinction
needs to be made between genuine possessive constructions (as
in the English sentence (4)) and formal possessive constructions (as
in the English experiential predicate of sentence (7)), in order to keep
the notion of possession as a cognitive domain distinct from possession
understood as a linguistic structure (Heine 1997; Langacker 1995;
Seiler 1983; Keidan 2008).

In this paper I investigate the syntactic, semantic, and pragmat- 
ic properties of Hindi predicative possessive constructions. Hindi has 
more than one construction for the encoding of possessive relation- 
ships, and I will attempt to show that each construction is customised 
for the encoding of particular semantic properties. The paper is organised 
as follows: § 2 deals with the formal taxonomies of possessive 
constructions that have been proposed in the last decades. This paves 
the way for the exposition of the construal of the domain of possession 
and of the consequences that this conceptualisation has for the linguistic 
encoding of this notion (§ 3). § 4 analyses the semantic prototype of 
possessive notions. §§ 5, 6 and 7 present Hindi data from a semasiological 
point of view and analyse the semantic and syntactic properties 
of Hindi possessive constructions. Lastly, § 8 draws some conclusions. 


2 Formal Distinctions¹ 


The first and most basic formal distinction is that between attributive 
possession (i.e. ‘Mark’s watch’) and predicative possession (i.e. ‘Mark 
has a watch’ or ‘The watch is Mark’s’). Both types of construction are 
used to express some kind of relation between two entities, but while 
in the former the relation is presupposed, in the latter it needs to be 
established. For this reason, attributive constructions consist of a sin- 
gle NP and the relation is internal to it, while predicative possession re- 
quires two NPs and the relation is mediated by a predicate. Moreover, 
while in attributive possession both the Possessor and the Possessee 
have the same pragmatic role (either the topic or the focus), in pre- 
dicative possession the Possessor is typically (but not always, as we 
will soon see) topical (Mazzitelli 2015, 33). The pragmatic difference 
emerges from the fact that only in attributive possession is the rela- 
tion already given, while in predicative possession this relationship is 
established through a predication. 


1 For an overview of the formal distinctions proposed on possessive constructions, 
see Heine 1997, 1-43 and Stassen 2009, 3-36. 
The ways in which the relation between the Possessor and the Possessee 
is predicated lead us to two other major distinctions within the 
macro group of predicative possessive constructions: that between ascription 
of possession and predication of belonging (Heine 1997; Seiler 
1983; Lehmann 2002; Stassen 2009)² and that between have-possessives 
and be-possessives (Isačenko 1974; Heine 1997; Keidan 2009). 
An ascription of possession encodes the relation between the two re- 
lata from a possessor-oriented point of view: it takes the Possessor as 
the topical item, while the Possessee is the new information and has 
the role of focus. In predications of belonging, instead, the Posses- 
see is the topic, while the Possessor adds new information: this con- 
struction encodes the relationship from a possessee-oriented point of 
view. Two examples from English are given below (examples (9)-(10)): 


(9) ASCRIPTION OF POSSESSION: Sarah has a red coat. 


(10) PREDICATION OF BELONGING: The red coat is Sarah's / The red coat belongs to Sarah. 


The most important difference between ascriptions of possession and 
predications of belonging is in the definiteness and topicality of the two 
items involved: the presence of an indefinite Possessee and of a topical 
Possessor seems to be the central characteristic of ascriptions of pos- 
session (Heine 1997, 30), whereas in a predication of belonging, the 
Possessee is typically definite, it being the topic of the sentence. However, 
it is worth noting that there is also a relevant semantic difference: 
as Taylor points out, predications of belonging allow "only limited ex- 
tension from the prototype" (Taylor 1989, 205). This means that while 
ascriptions of possession lend themselves to the expression of a large 
range of semantic notions (as exemplified in the English sentences 
(4)-(8)), the use of predications of belonging seems to be restricted only 
to the expression of prototypical possession. We will see, in the paragraph 
dedicated to the Hindi belong-construction (§ 6.3), that although 
predications of belonging do not have the same wide semantic function- 
ality of ascriptions of possession, they are not limited to the semantic 
area of prototypical ownership, but can also express different notions. 

The last fundamental distinction is that between the two syntactic 
macro-types of h-possessives (i.e. have-constructions) and e-possessives 
(i.e. existential-constructions) (Isačenko 1974; Keidan 2008). 


2 Terminology varies from linguist to linguist: for instance, Stassen (2009) uses the 
terms indefinite possession and definite possession, while Seiler (1983) and Lehmann 
(2002) prefer the terms ascription of possession and predication of belonging. Heine 
(1997) uses the terms have-constructions and belong-constructions. In this paper we 
will opt for the terms ascription of possession and predication of belonging: Heine's ter- 
minology could be confused with another formal distinction proposed in literature: that 
between have-constructions and be-constructions. 
Have-possessive constructions are transitive-agentive configurations 
where the semantics of possession is lexicalised in a verb. In have-possessive 
languages, the Possessor and the Possessee are metaphorically 
related to the prototypical Agent and Patient of a transitive action: 
the former is encoded as the subject, while the latter is encoded as 
the direct object of a transitive verb, as in the English sentence above 
'Sarah has a red coat'. E-possessive constructions instead are intransitive 
constructions with an existential-locative predicate or a copula, 
as in the Hindi example below (11). In these constructions, the Pos- 
sessor is encoded in the oblique case and the Possessee always per- 
forms the syntactic role of subject. 


(11) meri        bahan      ke pas        nai     sari            hai 
     1SG.GEN.F   sister.F   LOC(beside)   new.F   sari.F.SG.NOM   to be.3SG.PRS 
     ‘My sister has a new sari’. 


3 The Construal of Possession 


Like any other situation, possession needs to be conceptualised before 
being expressed linguistically. A possessive event always involves at 
least two participants, the Possessor (henceforth PR) and the Possessee 
(henceforth PE). Even though these two participants are co-dependent 
(there can be no PR without a PE and no PE without a PR) they are in 
an asymmetrical relationship. The asymmetry is both semantic - the 
prototypical PR has the PE at its disposal and controls it, but not vice 
versa - and pragmatic - prototypically the PR is topical, while the PE is 
the focus. Normally, the relationship between a PR and a PE is not per- 
ceived through the senses by the speaker - as it happens for example 
in the case of a location, that can be visually perceived - so posses- 
sion is a relatively abstract domain, quite complex to conceptualise. 
According to cognitive studies (Lakoff, Johnson 1980), the concep- 
tualisation of complex cognitive domains usually takes place through 
processes of simplification: complex and abstract domains are associat- 
ed with simpler and more concrete ones through the mental processes 
of metaphor and metonymy. From a linguistic perspective, this means 
that the encoding strategy of a complex situation is based on the for- 
mal expressions of the concrete domains on which the conceptualisa- 
tion is based. Furthermore, as has been demonstrated by many stud- 
ies on the genesis of linguistic expressions (Hopper, Traugott 1993), 
morphological and syntactic elements are the outcomes of processes 
of grammaticalisation of lexical items referring to concrete concepts. 
According to Heine's model (1997), in the genesis of linguistic ex- 
pressions of possessive situations the same mechanism of simplifica- 
tion comes into play: the complex and abstract domain of possession 
is conceived through more concrete domains. In particular, these sim- 
pler domains provide the basis for the emergence of what Heine (1997, 
46) calls “event schemas”, i.e. conceptual archetypes derived through 
the abstraction of a number of related events experienced through per- 
ception. Heine (1997, 46) states: 


What distinguishes event schemas from simple concepts in particular 
is that the former are composed of more than one perceptually 
discontinuous entity. For example, an event schema like "X 
EATS Y" typically contains three entities, which are X, EAT, and Y. 
Simple concepts, on the other hand, consist of no more than one 
entity, even though they may imply the presence of other entities 
in addition. 


From this definition, it follows that formally an event schema has the 
structure of a proposition (and not of a single lexeme), formed by a 
predicate and the arguments associated with it; [tab. 1] shows the eight 
source schemas theorised by Heine (1997).³ 


Table 1 Summary table of predicative possessive constructions (Heine 1997, 47) 


Event schema   Formula 
Action         X takes Y 
Location       Y is located at X 
Companion      X is with Y 
Genitive       X’s Y exists 
Goal           Y exists for/to X 
Source         Y exists from X 
Topic          As for X, Y exists 
Equation       Y is X’s (property) 


According to Heine's proposal, these schemas [tab. 1] are the concep- 
tual archetypes most used by the languages of the world as the basis 
for the construal of possession; some examples are given below. One 
clear instance of the Action Schema is the construction with the verb 
avere ‘to have’ in Italian, as in sentence (12). 


(12) io    ho             una   macchina   nuova 
     1SG   have.1SG.PRS   a     car.F.SG   new.F.SG 
     ‘I have a new car’. 


3 For a detailed analysis of each schema, see Heine 1997, 45-76. 
In this construction, the PR io is encoded as the subject of a transitive 
predication, and the PE, una macchina nuova, as the direct object; thus, 
the two participants are interpreted as the Agent and the Patient of an 
agentive situation. As Givón (2001, 134) points out, this type of construction 
most commonly emerges as a consequence of the semantic 
shift of verbs such as ‘take’, ‘grab’, ‘seize’, ‘get’, which lose their original 
meaning and acquire a bleached meaning of possession. The same 
semantic bleaching process characterised the evolution of the Latin 
verb habere 'to have' (etymologically related to the Italian verb avere), 
which derives from the PIE root *gʰh₁bʰ-(e)i- meaning ‘to take’ 
(Baldi, Cuzzolin 2005, 29; de Vaan 2008, 277). Thus, in Italian, possessive 
relationships are construed through the conceptual archetypes 
of agentive events. Possession is expressed through a nominative-accusative 
construction, i.e. the construction associated with 
prototypical agentive events: the PR is conceptualised as Agent, and the 
PE is conceptualised as Patient. 

A completely different construal of the possessive event is at the basis 
of Heine's Location Schema. An example of this type of construction 
is the Hindi sentence in (13): the PR is in the oblique case followed 
by the postposition -ke pas ‘beside’ and the PE is in the nominative and 
agrees in number with the predicate. 


(13) Sita   ke pas        nai     gari           hai 
     Sita   LOC(beside)   new.F   car.F.SG.NOM   to be.3SG.PRS 
     ‘Sita has a new car’. 


In this type of construction, the PR is conceptualised as the Place where 
the PE is located: the construction is an intransitive predication where 
the PR stands in the locative case, while the PE is the syntactic subject; 
the verb has an existential meaning. Note that, as in the case of the 
Marathi dative construction in example (1), in this sentence there is 
no lexicalisation of the possessive meaning: the semantics of the con- 
struction is projected by the global structure of the sentence. 
4 The Semantics of Possession 


Although unanimous consensus on the description of possession has 
not yet been reached, most scholars discuss possession in light of the 
theory of prototypes (Seiler 1983; Langacker 1995; Heine 1997; Stas- 
sen 2009). According to them, the domain of possession consists of sev- 
eral notions hierarchically organised around a prototypical one. Thus, 
the primary thrust in the analysis of the semantics of possession has 
focused upon the individuation of the prototypical possessive notion. 
In a cognitive theoretical approach, Langacker (2001; 2009) consid- 
ers possession as a particular instance of the cognitive strategy that he 
calls "Reference-point strategy". Langacker starts from the assumption 
that human beings can create mental access to an indefinite entity, the 
Target, by directing their attention to another, more definite entity functioning 
as a Reference Point. Thus, for example, in the NP 'Mark's watch' the 
watch the speaker refers to is brought to mind by evoking 
'Mark' as a reference point: 'Mark' is evoked in order to establish 
a mental link with ‘his watch’. Langacker identifies three prototypes: 
ownership, kinship and part-whole relationship, and he maintains that 
their prototypicality is a consequence of the fact that their possessors 
naturally lend themselves to the reference-point function. In these pro- 
totypical cases, the relationship that is used as the basis for the refer- 
ence-point strategy is "objectively construed", meaning that it exists in 
the real world; whereas in non-prototypical cases, the reference-point 
strategy is applied by means of metaphorical or metonymical extensions, 
and it is "subjectively construed" (Langacker 2009, 84). 
Heine (1997) and Stassen (2009), in contrast to Langacker, do not 
focus their analysis on the cognitive function of the possessive construction, 
but rather on its semantic parameters. Following an approach 
that can be traced back to binary semantic feature analysis, 
they ultimately detect only one prototypical notion. They 
assume that the fundamental parameters required by a possessive 
relationship are the control that the PR has over the PE, the proxim- 
ity between them and the lack of a temporal limit. Following Taylor 
(1989), Heine proposes a wider range of semantic properties for the 
individuation of the prototypical notion, adding to the parameters listed 
above the concreteness of the PE and the humanness of the PR (Heine 
1997, 39). Stassen's and Heine's proposals are quite similar: according 
to both, ownership⁴ is the prototypical possessive notion 
characterised by the maximum control of the PR over the PE, by spa- 
tial proximity and by the absence of a temporal limit. The further a 


4 Note that the same notion has been identified with different labels: for example, 
Stassen (2009) uses the term "alienable possession", while Heine (1997) calls it "per- 
manent possession". 
possessive notion strays from this prototype, the less it evinces those 
three fundamental parameters. As a logical consequence of the the- 
oretical differences between Heine and Stassen's approach and Lan- 
gacker's, the conclusions resulting from their studies differ markedly 
from those derived from Langacker's model. For example, inalienable 
possession cannot be considered prototypical, as it is in Langacker's 
model, since there is a lack of control on the part of the PR, who cannot 
decide to break the relationship, and the proximity between the 
PR and the PE is not necessarily spatial. 

In this paper, I follow Heine and Stassen's proposal, and I assume 
that the notion of ownership as defined below is the prototypical one: 


OWNERSHIP 

An asymmetrical relationship between two entities: a PR which must 
be [+HUMAN] and a PE which must be [-ANIMATE] and [+CONCRETE]. 
The PR has control over the PE and the relationship has 
no temporal limit. 


The range of notions conceived within the possessive domain varies 
from scholar to scholar: Heine (1997, 34-40) distinguishes seven possessive 
notions, namely permanent possession (i.e. ownership), physical 
possession, temporary possession, abstract possession, inalienable possession, 
inanimate inalienable possession and inanimate alienable possession; 
Stassen (2009, 16), in contrast, proposes a conceptual space 
of four notions: alienable possession (i.e. ownership), inalienable possession, 
temporary or physical possession, and abstract possession. 
In this paper, I assume that one of the fundamental properties of pos- 
session is the humanness of the PR, and following Stassen (2009, 17) 
I exclude inanimate possession from my analysis, considering it to be 
merely a metaphorical extension of possession. 

The following paragraphs focus on the Hindi expression of the four 
possessive sub-domains proposed by Stassen (2009). These possessive 
notions can be described with reference to the different values they 
assume in regard to the properties of control, temporal limit and spa- 
tial proximity. Specifically, they can be defined as follows: 

* Ownership, as defined above. 

* Temporary possession and physical possession: the PR can dispose 
of the PE, even if he/she does not own it, as in the sentence 
‘I have an apartment where I can stay when I spend the night in 
London; it belongs to my uncle'. In physical possession, the PR 
and the PE are physically associated, and the PE is available to 
be used by the PR even though not belonging to him/her, as in ‘I 
have my sister's keys with me'. 

* Inalienable possession: the relationship between the PR and the 
PE is considered to be inherent, and usually the PE is not a material 
object but a body-part or a person. Moreover, the PR has 
no control in the relationship: the parameter of control implies 
that the PR can choose when to break the relationship, and in inalienable 
possession the PR does not have this power. Consider 
the English example: ‘That girl has three brothers’. 

* Abstract possession: the PE is an abstract entity or an experiential 
state like an emotion or a body-sensation, as in the English 
sentence 'I have a headache'. 
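
As an informal illustration of how the four sub-domains listed above can be told apart by feature values, the following Python sketch encodes a PR-PE relation as a small bundle of binary features and assigns it to a sub-domain. The feature names (`pe_concrete`, `pe_inherent`, `pr_controls`, `time_limited`) and the order of the checks are assumptions made for this sketch only; they are not part of Heine's or Stassen's proposals, and temporary and physical possession are deliberately collapsed into a single label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical feature bundle for a PR-PE relation (illustrative only)."""
    pe_concrete: bool   # the PE is a material entity
    pe_inherent: bool   # the PE is inherently related to the PR (kin, body part)
    pr_controls: bool   # the PR can decide to end the relation
    time_limited: bool  # the relation has a temporal limit


def classify(rel: Relation) -> str:
    """Assign a relation to one of the four possessive sub-domains."""
    if rel.pe_inherent:
        # Inherent relations leave the PR no control over the relation.
        return "inalienable possession"
    if not rel.pe_concrete:
        # Emotions, sensations, rights and similar immaterial PEs.
        return "abstract possession"
    if rel.pr_controls and not rel.time_limited:
        return "ownership"
    # The PE is available to the PR without being owned by him/her.
    return "temporary or physical possession"
```

Under these assumptions, 'My doctor has a Volkswagen' would be coded as `Relation(True, False, True, False)` and classified as ownership, while 'That girl has three brothers' would be coded with `pe_inherent=True` and classified as inalienable possession.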


5 The Hindi Expression of Possession 


Unlike most SAE languages, Hindi does not have a single construction 
that covers all the semantic notions identified in the previous paragraph 
(§ 4), exemplified by the English sentences (4)-(8). Instead, it 
uses different constructions, each one specialised for the encoding of 
particular semantic features. Let us see the Hindi translation of the 
English sentences (4)-(8) (sentences (14)-(18)): 


(14) My doctor has a Volkswagen. 
     [HUMAN-MATERIAL ENTITY]: [Ownership] > Locative construction 
     mere        daktar     ke pas        Volkswagen       hai 
     1SG.GEN.M   doctor.M   LOC(beside)   Volkswagen.NOM   to be.3SG.PRS 


(15) Maria has a twin sister. 
     [HUMAN-HUMAN]: [Inalienable possession: Kinship] > Genitive construction 
     Maria   ki      ek    jurvam bahan           hai 
     Maria   GEN.F   one   twin-sister.F.SG.NOM   to be.3SG.PRS 


(16) Everyone has the right to speak. 
     [HUMAN-ABSTRACT ENTITY]: [Abstract possession] > Dative construction 
     sab        ko    bolne              ka      adhikar          hai 
     everyone   DAT   to speak.INF.OBL   GEN.M   right.M.SG.NOM   to be.3SG.PRS 


(17) I have no idea. 
     [HUMAN-COGNITION]: [Abstract possession: Experience] > Dative construction 
     mujhe     khabar               nahim   hai 
     1SG.DAT   information.SG.NOM   not     to be.3SG.PRS 


(18) That woman has a lot of courage. 
     [HUMAN-QUALITY]: [Abstract possession: Quality] > Inessive construction 
     us         aurat          mem       bahut      sahas            hai 
     that.OBL   woman.SG.OBL   LOC(in)   a lot of   courage.SG.NOM   to be.3SG.PRS 
This behaviour is typical of Hindi, a language exhibiting a number of 
syntactic patterns selected on the basis of semantic parameters. This 
peculiarity of Hindi has led Montaut (2004a; 2013) to define these pat- 
terns as semantic alignments (following the definition given in Wich- 
mann 2008) rather than syntactic ones. In fact, Hindi clearly encodes 
semantic roles in a rather iconic way and applies each syntactic pat- 
tern to specific semantic features, mainly related to the most salient 
participant and to the type of the event (Montaut 2004b). 
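
The correspondences between the type of Possessee and the construction chosen in sentences (14)-(18) can be summarised as a simple lookup table. The Python sketch below is only a restatement of those five examples: the dictionary keys are informal labels introduced here, the markers are the forms that appear in the cited sentences (the genitive marker, for instance, varies with the gender and number of the Possessee), and the mapping is not meant to be exhaustive.

```python
# Construction and marker as attested in examples (14)-(18); labels are informal.
CONSTRUCTION_BY_PE_TYPE = {
    "material entity": ("locative", "ke pas"),  # (14) ownership
    "kin": ("genitive", "ki"),                  # (15) inalienable possession
    "abstract entity": ("dative", "ko"),        # (16) abstract possession
    "cognition": ("dative", "mujhe"),           # (17) pronominal dative form
    "quality": ("inessive", "mem"),             # (18) abstract possession
}


def construction_for(pe_type: str) -> str:
    """Return the construction used for a given type of Possessee."""
    if pe_type not in CONSTRUCTION_BY_PE_TYPE:
        raise KeyError(f"no example in (14)-(18) for PE type {pe_type!r}")
    construction, _marker = CONSTRUCTION_BY_PE_TYPE[pe_type]
    return construction
```

The point of the table is that the English translations all share one verb, whereas the Hindi side has four distinct values.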

Thus, for example, the use of the transitive-ergative pattern in Hin- 
di is generally restricted to prototypical Agents volitionally acting and 
controlling the scene. When the Agent does not control the event, another 
pattern is chosen, e.g. the instrumental one.⁵ Similarly, when 


5 One of the reviewers highlights that the ergative construction in Hindi has many deviations 
from the principle of iconicity. I agree with him/her that ergative marking in Hindi 
seems to be partly triggered by syntactic features; however, the correlation between 
ergativity, syntax and semantics in Hindi is not easy to understand. For example, one of 
the main arguments used to demonstrate that ergativity depends also on syntax is that 
volitional Agents are not marked by ergative postposition if the predicate is expressed 
by compound verbs with intransitive light verbs. However, Drocco (2018) and Drocco, 
Tiwari (2020) showed that when transitive verbs are followed by intransitive light verbs 
(like baithna or jana) "the meaning conveyed is that the Agent-like argument either act- 
ed foolishly, or unconsciously, or lost control over his actions, or was even forced to do 
something against his wishes" (Drocco, Tiwari 2020, 329). While the Agent argument is 
not marked with the ergative case, and this seems to be triggered by syntactic properties 
(since it happens when the light verb is intransitive), we cannot ignore the fact that 
compound-verb constructions also have semantic consequences, and when the light verb 
is intransitive the construction seems to express reduced transitivity. 

Undoubtedly more work needs to be done to understand the correlation between er- 
gativity and iconicity in Hindi, but many arguments can be given to support the thesis of 
iconicity. For example, the single argument of a set of ‘body emission' predicates can be 
optionally marked with the ergative case. When this happens, the ergative case-marking 
encodes a more prototypical Agent: volitional and in control of the event (Mohanan 
1994; de Hoop, Narasimhan 2005). Moreover, there are clearly contrasting examples 
showing that the ergative marking carries the semantics of agentivity, 
as opposed to other case markings: 


Dative Experiencer vs Ergative Experiencer: 


sahsa      use       mamréya   ke samne           curiyom          ki 
suddenly   3SG.DAT   shed      PSP(in front of)   bracelet.F.OBL   PSP.GEN.F 
jhamkar           sunai di           usne      kan lagakar 
tinkle.F.SG.NOM   to hear.PRF.F.SG   3SG.ERG   strain the ear.CVB 
suna              ham   koi hai 
listen.PRF.M.SG   yes   there was someone 
‘Suddenly he heard the tinkle of bracelets outside the shed. He strained his ears and listened. Yes, 
there was someone’. 


Instrumental Agent vs Ergative Agent: 


A  tum-him    ne    us-ka     khun    kiya 
   2SG-EMPH   ERG   3SG-GEN   blood   to do.PRF.M.SG 
the perceiver of a visual or auditory perception is agentive and 
controls the perception, the pattern selected is the transitive one 
and the perceiver is encoded as the Agent (in the NOM/ERG). However, 
when the perception is not controlled by the perceiver, s/he is encoded 
as an Experiencer, e.g. in the dative case: the choice thus follows 
from the semantic parameters of the event. 

Literature on case (Mallinson, Blake 1981; Comrie 1989; Malchu- 
kov 2005; 2015; de Hoop, Narasimhan 2005) generally distinguishes 
two main functions of case-marking: the so-called indexing function 
and disambiguating function. The indexing function uses cases to ex- 
press semantic roles (or specific semantic features of the argument), 
while the disambiguating function uses cases mostly or exclusively to 
mark core arguments and express grammatical relations. Following 
this distinction, Malchukov (2005; 2015) proposes two typological ten- 
dencies determining case-marking cross-linguistically: 

* Iconicity, which implies the “choice of the most semantically fit- 
ting frame" (Malchukov 2005, 85) when encoding semantic roles, 
thus favouring the indexing function. 

* Markedness, which implies the "choice of the transitive frame 
as a major default pattern" (Malchukov 2005, 85) for the expression 
of most events, thus favouring the disambiguating function. 


Languages of the world vary in their ways of ranking these two pa- 
rameters. Languages that rank Iconicity over Markedness are more 
concerned with the faithful encoding of the semantic features of their 
arguments: these languages tend not to extend the use of transitive 
constructions to non-transitive events, because in such languages tran- 
sitive constructions are semantically constrained to prototypical tran- 
sitivity. In contrast, languages that favour Markedness over Iconici- 
ty are more concerned with the differentiation of the two principal 
syntactic elements (the subject and the object) from peripheral argu- 
ments, and therefore they tend to use transitive patterns by default, 
regardless of the semantic properties of the event. 


B  sahab   maim-ne   us-ka     khun    nahim   kiya             mujhse    ho gaya 
   sir     1SG-ERG   3SG-GEN   blood   not     to do.PRF.M.SG   1SG.INS   happen.PRF.M.SG 
A: ‘It’s you who murdered him’. B: ‘Sir, I did not kill him, it happened by myself (I did it unconsciously)’. 
(Example taken from Montaut 2004a, 211) 


Unfortunately, a discussion of the correlation between ergativity and iconicity in Hindi 
is beyond the scope of the present paper; for a thorough investigation of differential 
subject marking and the indexing function in Hindi the reader can refer to de Hoop, Narasimhan 
2005 and Mohanan 1994. For a detailed overview of the study of the interaction 
between ergativity and semantic transitivity, see Drocco 2008. 
With possessive constructions, the interaction between these two 
parameters determines the typological variability between have-possessive 
and be-possessive languages. Languages that rank Markedness 
over Iconicity are nominative-accusative languages (like English 
and Italian) that tend to extend the transitive pattern (i.e. Heine's Action 
Schema) to non-transitive events, thus encoding the participants 
of most events with the nominative and accusative cases, without regard 
for the semantic properties of the arguments. Languages that 
rank Iconicity over Markedness, instead, do not extend the transitive 
construction to encode possessive notions, since it is specialised for the 
encoding of Agentive Events. 
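
The interaction of the two tendencies can be pictured as a ranked choice between two candidate frames. The following Python sketch is a deliberately simplified illustration, not a formalisation taken from Malchukov: the function name, the string labels for the rankings and the idea of passing in a ready-made "semantically fitting" frame are all assumptions of the sketch.

```python
RANKINGS = ("iconicity>markedness", "markedness>iconicity")


def choose_frame(ranking: str, prototypically_transitive: bool,
                 fitting_frame: str) -> str:
    """Pick the case frame for an event under a given ranking.

    ranking: which tendency outranks the other (see RANKINGS).
    prototypically_transitive: whether the event has a volitional Agent
        and an affected Patient.
    fitting_frame: the frame that best matches the event's semantics,
        e.g. "locative" or "dative" (supplied by the caller).
    """
    if ranking not in RANKINGS:
        raise ValueError(f"unknown ranking: {ranking!r}")
    if prototypically_transitive:
        # Both tendencies agree on the transitive frame here.
        return "transitive"
    if ranking == "markedness>iconicity":
        # The transitive frame is extended as the default pattern.
        return "transitive"
    # Iconicity wins: keep the semantically fitting frame.
    return fitting_frame
```

On this view a stative possessive event, which is not prototypically transitive, comes out as "transitive" under the ranking assumed for English or Italian and as, say, "locative" under the ranking assumed for Hindi.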

In Tsunoda's Implicational Hierarchy of Transitivity (Tsunoda 1985; 
2015; tab. 2), possessive events are the most distant from the prototypical 
transitive ones:⁶ a prototypical transitive event is dynamic 
and concrete, characterised by an intentionally acting Agent and by 
a Patient that is directly affected in a perceptually salient way (Kittilä 
2002, 190). A possessive event lacks both of these properties: it has 
neither an intentionally acting Agent nor an affected Patient, and in 
fact it is not even a dynamic event, but rather a stative one. This explains 
why the Action Schema is not often employed among the languages 
of the world, since most languages encode possession with intransitive 
sentences having an oblique PR and a nominative PE. Heine 
(1997, 75) points out that, remarkably, only 13.6% of the languages in 
the world use the Action Schema as their major schema to express 
possession. Only highly nominative-accusative languages (like many 
SAE languages) allow the extension of transitive constructions to stative 
situations by using transitive verbs such as Eng. have or It. avere 
that lexicalise the semantics of possession. 


Table 2  Tsunoda's Implicational Hierarchy of Transitivity (Tsunoda 2015, 1598) 


Type  Meaning                   Subtype                    Examples 
1     Direct effect on patient  1A Resultative             kill, break, bend 
                                1B Non-resultative         hit, shoot, kick 
2     Perception                2A Patient more attained   see, hear, find 
                                2B Patient less attained   look, listen 
3     Pursuit                                              search, wait, await 
4     Knowledge                                            know, understand, remember, forget 
5     Feeling                                              love, like, want, need, fond, fear, afraid, angry, proud, boast 
6     Relationship                                         possess, have, lack, lacking, resemble, similar, correspond, consist 
7     Ability                                              capable, proficient, good 


6 Notably, Malchukov (2005; 2015) does not even include possessive verbs in his 
two-dimensional Transitivity Hierarchy. 


Montaut (2004a; 2004b) points out that the action model (i.e. the transitive 
pattern) in Hindi is clearly marginal and is restricted to action 
processes where the action chain is fully profiled (imperfective 
aspect). She lists six basic patterns in Hindi, which she defines as follows 
(Montaut 2004b, 51): 

1. the nominative accusative diathesis represents action processes; 

2. the ergative diathesis encodes action processes but viewed 
from the viewpoint of the result (aspectual split), and not as 
an action; 

3. the dative diathesis describes experiential processes; 

4. the instrumental diathesis describes non-volitional actions in 
the affirmative and unfeasible actions in the negative, centred 
on actors lacking some of the features of the agent; 

5. the locative and genitive diatheses describe states. 


Montaut states that patterns from 2 to 5 are “absolute construals” (as 
defined by Langacker 1999), where the less salient entity is the start- 
ing point from the linguistic viewpoint and the most salient argument 
is dissociated from the predication and encoded iconically. Following 
Montaut, I propose that predicative possessive constructions in Hin- 
di are realised as absolute predications where the less salient enti- 
ty - the PE - is encoded as the subject, while the PR is dissociated from 
the predication and its case marking is semantically constrained and 
depends on the semantic properties of the relation. This would explain 
the variety of constructions that Hindi uses to translate the English possessive 
sentences (4) to (8): as we have seen, each construction 
expresses a different possessive situation. 


6 Presentation of Hindi Data 


In the next paragraphs, I propose a semasiological presentation of 
possessive constructions in Hindi. The examples shown in these par- 
agraphs are taken from a classic of modern Hindi literature, Godan 
by Munshi Premcand, published in 1936. This corpus has been queried 
through SketchEngine.⁷ 


7 https://www.sketchengine.eu. 
As scholars of Hindi know well,⁸ this language uses at least two different 
possessive constructions for the encoding of prototypical possession: 
both are existential constructions, the first encoding the PR in 
the oblique case followed by the locative postposition -ke pas ‘beside’, 
and the second encoding it in the genitive. First, the locative construction 
(§ 6.1), which can express the notion of ownership but can also express 
physical possession and temporary possession, is discussed. In 
§§ 6.2-6.4, an analysis of the genitive pattern, which can express the 
prototypical notion of ownership, and which is more frequently used 
for the encoding of inalienable relationships, is presented. In § 6.5 other 
non-prototypical uses of these constructions are considered. Finally, 
§ 7 focuses on the analysis of the dative and inessive 
constructions, which can express only abstract possession and do not 
allow the encoding of more prototypical notions. 


6.1 The Locative Construction 


The pattern of the locative construction is as follows: the PR NP is in in- 
itial position, marked in the oblique case and followed by the postpo- 
sition -ke pas, which means ‘beside’ and is normally used for the en- 
coding of locations; next comes the PE, marked in the nominative case; 
and in the final position there is the verb hona 'to be', which agrees 
in number and person (in past tenses also in gender) with the PE and 
has an existential function. This construction can be schematised as 
follows: Y is at X's place > X has, owns Y and can be associated with
Heine's Location Schema (Heine 1997, 51). 

Before continuing with the exposition, it is essential to briefly ex- 
amine the use of the compound postposition -ke pas to better under- 
stand the examples and their glosses. Like any other compound post- 
position in Hindi, the postposition -ke pas is composed of the simple 
genitive postposition (in this case -ke) followed by an adverb (in this 
case pas ‘near’). When compound postpositions follow a noun, they 
are attached to its oblique form, as in example (19), but when they 
follow a personal pronoun, the possessive form of the pronoun, rath- 
er than its oblique form followed by the genitival postposition, is re- 
quired, as in sentences (20)-(21). 


(19) Mehta ke pās saman to jyada na tha
Mehta LOC(beside) belongings.M.SG.NOM many not to be.3SG.PST.M
‘Mehta didn't have many belongings’.


8 Kachru 1970; Montaut 1997; 2004a; Mohanan 1994; Pandharipande 1981. 




Bhasha e-ISSN 2785-5953 
1,1, 2022, 133-172 


Lucrezia Carnesale 
Predicative Possessive Constructions in Hindi 


(20) lekin mere pas nagad nahim hai
but 1SG.LOC(beside) cash.SG.NOM not to be.3SG.PRS
‘But I have no cash’.


(21) hamare pas jo kuch hai vah
1PL.LOC(beside) REL.ADJ.DIR something.NOM to be.3SG.PRS CRR.PRN.NOM
abhi khalihan mem hai
now barn PSP(in) to be.3SG.PRS


‘What we have now is in the barn’. 


Note that the possessive construction with the postposition -ke pas is 
formally identical to the Hindi locative construction. In truly locative 
constructions, the location argument is in the oblique case followed 
by the postposition -ke pas (as the PR in the possessive construction), 
with the entity located appearing in the nominative case (as the PE) and 
the predicate being the existential verb hona ‘be’. The most important 
difference between these two sentence-types is the semantics of their 
two arguments: when the argument preceding the postposition -ke pas 
is [+ HUMAN] and the second argument is [-ANIMATE], the resulting 
construction is a possessive one. Remarkably, the semantics of pos- 
session is not lexicalised in a lexical item, but rather it emerges from 
the instantiation of the locative construction through these specific semantic
features. If these features are absent, then the resulting construction
has a locative meaning. See examples (22a)-(22d) below:⁹


(22a) 1° argument: [+HUMAN]; 2° argument: [-ANIMATE]: Possession
uske pās qalam hai
3SG.LOC(beside) pen.SG.NOM to be.3SG.PRS


‘He has a pen’. 


(22b) 1° argument: [-HUMAN]; 2° argument: [-ANIMATE]: Location
kitab ke pas qalam hai
book LOC(beside) pen.SG.NOM to be.3SG.PRS
‘Next to the book, there is a pen’.

(22c) 1° argument: [-HUMAN]; 2° argument: [+HUMAN]: Location
gari ke pās Sitā hai
car LOC(beside) Sita.NOM to be.3SG.PRS
‘Next to the car, there is Sita’.


9 Examples (22a)-(22d), (23a)-(23b), example (43) from § 6.4 and examples (52)-(53)
from § 7.1 are not taken from the corpus.
(22d) 1° argument: [+HUMAN]; 2° argument: [+HUMAN]: Location
Sita ke pās meri bahan hai
Sita LOC(beside) 1SG.GEN.F sister.F.SG.NOM to be.3SG.PRS
‘Next to Sita, there is my sister’. 
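
The contrast illustrated in (22a)-(22d) can be restated as a small decision rule. The following is only a minimal sketch, assuming that the two binary features discussed above are the sole inputs; the function name and the boolean encoding are illustrative and do not come from the corpus study, and the sketch ignores the further syntactic differences between the two constructions.

```python
def interpret_ke_pas(first_is_human: bool, second_is_animate: bool) -> str:
    """Reading of 'X ke pas Y hai' given the features of X and Y.

    Hypothetical helper: a possessive reading arises only when the first
    argument is [+HUMAN] and the second is [-ANIMATE]; every other
    combination is read as a plain locative.
    """
    if first_is_human and not second_is_animate:
        return "possession"
    return "location"


# Feature values (first_is_human, second_is_animate) for examples (22a)-(22d).
examples = {
    "22a": (True, False),   # uske pas qalam hai
    "22b": (False, False),  # kitab ke pas qalam hai
    "22c": (False, True),   # gari ke pas Sita hai
    "22d": (True, True),    # Sita ke pas meri bahan hai
}

readings = {label: interpret_ke_pas(*features) for label, features in examples.items()}
for label, reading in readings.items():
    print(label, reading)
```

Only (22a) receives the possessive reading under this rule, which matches the labels given with the examples.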


Some other semantic and syntactic features also differentiate the pos- 
sessive construction from the locative. In particular: 

1. Only in a locative construction can the postposition -ke pas,
normally meaning 'beside', be exchanged with some other syn- 
onymic postpositions, like -ke bagal mem or -ke nikat 'next, 
near to'. The possessive construction does not allow for the 
interchangeability of -ke pas with other locative postpositions; 
if another locative postposition is selected, the resulting con- 
struction acquires an existential-locative meaning. See the 
contrasting examples below (23a) and (23b). 


(23a) Ram ke pas nai kitab hai
Ram LOC(beside) new.F book.F.SG.NOM to be.3SG.PRS
‘Ram has the new book’.


(23b) Ram ke bagal mem nai kitab hai
Ram LOC(beside) new.F book.F.SG.NOM to be.3SG.PRS
‘Next to Ram there is the new book. *Ram has the new book’.


This phenomenon may be explained by the fact that the locative con- 
struction instantiated with a [+HUMAN] location and a [-ANIMATE] 
second argument has undergone a grammaticalisation process caus- 
ing the desemantisation of the postposition -ke pas, which, in this con- 
text, has lost its original lexical meaning. 

2. Only in a possessive sentence is the element preceding the postposition
-ke pas endowed with some non-nominative subjects’
properties. First, in non-pragmatically marked possessive con- 
structions, the PR is in initial position, whereas in non-marked 
locative constructions, the element followed by the postposition 
-ke pas is preverbal and the subject is in initial position. Second- 
ly, only the PR governs coreference with the reflexive pronoun 
apna: see the example (24). 


(24) mere pas apne dost ki
1SG.LOC(beside) REFL.ADJ.M.SG.OBL friend.M.SG.OBL GEN.F
kitab hai
book.F.SG.NOM to be.3SG.PRS
‘I have my friend's book’.
As mentioned in § 2, the prototypical information structure of an ascription
of possession requires a topical PR with the PE as the comment.
As Keidan (2008, 349) points out, "the topicality we are concerned
with here belongs primarily to the cognitive domain, not simply
to the grammatical level". An ascription of possession, indeed, encodes 
the relation between the two relata from a possessor-oriented point of 
view, taking the PR as the starting point for the predication of the re- 
lationship. Languages use different ways to mark the topicality of the 
PR: have-possessive languages, for example, raise it to the syntactic
status of subject. In these languages, the PR is marked as a nomina- 
tive and combines various syntactic properties of subjecthood (Kibrik 
1997; Onishi 2001). As noted above, Hindi is a highly iconic language: 
it uses cases to faithfully encode the thematic properties of the most 
salient element. Moreover, as Montaut (2004b, 51) points out, in Hindi 


the profiled segment always leaves the cognitively more salient 
entity in a secondary position, so that the less salient entity is the 
starting point from the linguistic viewpoint. Hindi indeed shows a 
clear preference for profiling less salient entities as starting points 
in asymmetric relations. 


This means that while Hindi encodes the more salient entity in the sen- 
tence through an iconic use of case marking, it assigns the nominative 
to the less salient entity by default. In ascriptions of possession, the PR 
is always more salient than the PE, as it is prototypically [+HUMAN] and 
the topical element. Consequently, in Hindi possessive sentences the 
syntactic properties of subjecthood are split between the PR and the PE: 
this explains why the PR, even if marked as a locative, is endowed with 
such syntactic properties as initial position in the unmarked sentence 
and the control of coreference with reflexive pronouns and adjectives. 

Let us now move on to the semantics of this type of possessive con- 
struction. As already mentioned above, the locative construction with 
the postposition -ke pas can express the notions of ownership, tempo- 
rary possession, and physical possession. Examples are given in sen- 
tences (25)-(27). This pattern is thus characterised by a certain degree 
of ambiguity: it is only the context that helps us to understand what 
type of possessive notions the construction is encoding. 


10 Remarkably, Montaut (2004b, 51) points out that "full subjecthood is restricted 
in Hindi/Urdu to action phrases and single arguments of simple verbs". Moreover, discussing
the notion of subject in Hindi, Drocco (2008, 40-1) points out that "l'analisi relativa
alla determinazione del soggetto in hindi è stata infatti effettuata basandosi non
tanto sulle proprietà di codifica, bensì sulle proprietà relative al controllo dei diversi
processi sintattici" (Author's transl.: “The study of the notion of subjecthood in Hindi
has been carried out not through the analysis of the coding properties of the argument,
but through the analysis of its behavioral properties: i.e. through syntactic tests”).




(25) hamare pas ilake, mahal, savariyam
1PL.LOC(beside) land.M.PL.NOM palace.PL.NOM carriage.F.PL.NOM
naukar-cakar haim
servant.PL.NOM to be.3PL.PRS
‘We have lands, palaces, carriages, servants’.


(26) hamare pas becne ko bhūsā nahim
1PL.LOC(beside) to sell.INF.OBL PSP(to) straw.SG.NOM not
hai
to be.3SG.PRS
‘We have no straw to sell’.


(27) jiske pas jo kuch ho,
REL.PRN.LOC(beside) REL.ADJ.DIR INDF.PRN.NOM to be.3SG.SBJV
nikalkar rakh de
take.out.CVB to put.3SG.SBJV
‘Take out what you have and put it here’.


6.2 The Genitive Construction 


As noted by many scholars of Hindi (Caracchi 2002; Pandharipande 
1981; McGregor 1972; Mohanan 1994), in this language the notion of 
ownership can also be expressed by a genitive construction. The pat- 
tern of this construction is as follows: the PR is in initial position and is 
marked in the genitive case, the PE is in preverbal position marked in 
the nominative case and the predicate is expressed by the existential 
verb hona ‘be’. The verb agrees in number and person (and, in past tenses,
in gender) with the PE. Note that the PR is marked in the oblique case
and followed by the genitive postposition -ka (/-ke/-ki) which agrees in 
gender and number with the PE, thus forming an adjectival unit with 
the PR: see examples (28a)-(28c). In particular, the genitive form -ka 
is the masculine singular form, while the masculine plural is -ke; the 
feminine form is -kī and it is the same for both the singular and the plu- 
ral. As in the case of the locative postposition -ke pas, when the gen- 
itive postposition -ka/-ke/-ki follows a personal pronoun, the posses- 
sive form of the pronoun is required, as exemplified in (28d) and (28e). 


(28a) bacc-e k-ā dibb-ā
child.M.SG.OBL GEN.M.SG.DIR box.M.SG.DIR
‘The child's box’.


(28b) bacc-e k-e dibb-e
child.M.SG.OBL GEN.M.PL.DIR box.M.PL.DIR
‘The child's boxes’.

(28c) bacc-e k-ī kitab
child.M.SG.OBL GEN.F book[F.SG]
‘The child's book’.


(28d) mer-a dibb-a
1SG.GEN.M.SG.DIR box.M.SG.DIR
‘My box’.

(28e) mer-i kitab
1SG.GEN.F book[F.SG]
‘My book’. 
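
The agreement pattern exemplified in (28a)-(28c) can be summarised as a lookup keyed on the gender and number of the PE. This is a hedged sketch rather than a full morphological model: it covers only the three forms listed in the text (-ka, -ke, -ki) with a PE in the direct case, uses transliterations without diacritics, and the names `GENITIVE_FORMS`, `genitive_postposition` and `genitive_phrase` are invented for illustration.

```python
# Forms of the genitive postposition, selected by the features of the PE
# (the possessee), not of the PR (the possessor).
GENITIVE_FORMS = {
    ("M", "SG"): "ka",
    ("M", "PL"): "ke",
    ("F", "SG"): "ki",  # the feminine form is the same in singular and plural
    ("F", "PL"): "ki",
}


def genitive_postposition(pe_gender: str, pe_number: str) -> str:
    """Return the postposition form that agrees with the PE."""
    return GENITIVE_FORMS[(pe_gender, pe_number)]


def genitive_phrase(pr_oblique: str, pe: str, pe_gender: str, pe_number: str) -> str:
    """Build 'PR(oblique) + genitive postposition + PE' as a plain string."""
    return f"{pr_oblique} {genitive_postposition(pe_gender, pe_number)} {pe}"


print(genitive_phrase("bacce", "dibba", "M", "SG"))  # cf. (28a)
print(genitive_phrase("bacce", "dibbe", "M", "PL"))  # cf. (28b)
print(genitive_phrase("bacce", "kitab", "F", "SG"))  # cf. (28c)
```

The PR stays in the same oblique form in all three phrases; only the postposition changes, which is what gives the PR plus postposition its adjectival character.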


This construction corresponds to Heine's Genitive Schema, summarised
in the formula: X's Y exists > X has Y (Heine 1997, 58). An example
is given below (29):


(29) unki tin larkiyam thim 
3PL.GEN.F three daughter.F.PL.NOM to be.3PL.PST.F
‘He had three daughters’. 


Notably, as in the case of locative constructions, the PR marked with 
the genitive case acquires some syntactic properties of subjecthood: 
it is always in the initial position, and it controls the coreference with 
the reflexive pronoun and coreferential deletion; see the example (30) 
taken from Montaut (2013, 93). 


(30) mera apni bahan se milne dillī
1SG.GEN.M.SG REFL.F sister COM to meet.INF.OBL Delhi
jāne ka irada tha
go.INF.OBL GEN.M.SG intention.M.SG.NOM be.PST.M.SG
‘I intended to visit my sister in Delhi’ (Lit. ‘I had the intention to visit my sister
in Delhi’).


In many contexts, genitive constructions and locative constructions 
are semantically interchangeable (Mohanan 1994, 178): a sentence 
of the type ‘That man owns a huge house’ can be translated into Hin- 
di with either a genitive construction (example 31) or a locative con- 
struction (example 32): 


(31) us admi ka ek bahut barā
that.OBL man.SG.OBL GEN.M.SG one very big.M.SG.DIR
makān hai
house.M.SG.NOM to be.3SG.PRS

(32) us admi ke pas ek bahut barā
that.OBL man.SG.OBL LOC(beside) one very big.M.SG.DIR
makān hai
house.M.SG.NOM to be.3SG.PRS


McGregor (1972, 51) suggests that there is a semantic difference be- 
tween the two constructions: while the genitive construction express- 
es a permanent possessive relationship, the locative pattern is used 
for more contingent relationships, e.g. what has been identified here 
as temporary and physical possession. Even if it is true that the loc- 
ative construction can be used to express non-permanent possession 
(as noted in § 6.1), some scholars have shown that such a semantic 
distinction is not applicable. 

Mohanan (1994, 179), for example, points out that both constructions 
can be modified by a subordinate clause of the type “which he is trying 
to sell”, thus implying that the genitive construction can also express
non-permanent possession. In the same way, both sentences can be modified
by the clause “which he will hand down to his children”, thus implying 
that locative constructions, too, can encode permanent possession. Pan- 
dharipande (1981) suggests that the selection of the genitive construc- 
tion for the encoding of alienable possession is determined by how the 
relation between the PR and the PE is perceived by the speaker. A pecu- 
liarity of the genitive construction is that most characteristically it in- 
volves concrete entities as PE (like estates, buildings and lands) that are 
normally perceived as being less alienable than the entities frequently 
involved in locative constructions (like money, books, etc.). Moreover, 
the genitive construction expressing ownership typically occurs when 
the PE is perceived to be particularly close to the personal sphere of the 
PR; in this regard, note the following contrasting examples (33)-(34).


(33) uske maurūsī pamc bighe khet haim
3SG.GEN.M.PL inherited five bighe field.M.PL.NOM to be.3PL.PRS
‘He has an inherited field of five bighe [Indian unit of measure]’.


(34) hamare pas ilake, mahal, savariyam
1PL.LOC(beside) land.M.PL.NOM palace.PL.NOM carriage.F.PL.NOM
naukar-cakar haim
servant.PL.NOM to be.3PL.PRS
‘We have lands, palaces, carriages, servants’.


In sentence (33), the speaker is answering another character who 
asked him whether his family owns land. In his answer, the speaker 
chooses to express the notion of ownership through a genitive construction:
the PR is a peasant family who inherited the PE - a land - and
who is particularly attached to it, having cultivated it for generations.
In sentence (34), the PR is a zamindar. Zamindars were large land- 
owners who did not have the same attachment to the land, as it was 
worked by others. Consequently, in this context the notion of owner- 
ship is expressed by a locative construction. Thus, the genitive con- 
struction is generally associated with a more intimate possessive re- 
lationship, in which the PR is perceived as emotionally attached to the 
PE; on the other hand, the locative construction seems to be used for 
the expression of mere legal ownership. Sentence (35) offers a fur- 
ther interesting example: 


(35) Agar vah ek bigha bhi bec de, to sau mil jayam, lekin kisan ke liye jamin jan se bhi
pyari hai, kul-maryada se bhi pyari hai;
‘If he sold even one bigha of land, he could get a hundred rupees. But to a
peasant, land is dearer than life, dearer even than family reputation;’


aur kul tin hi bighe to uske pās hai, agar
and total.NOM three just bighe 3SG.LOC(beside) to be.3SG.PRS if
ek bigha bec de to phir kheti kaise karegā
one bigha to sell.3SG.SBJV then cultivation how to do.3SG.FUT


‘And he had just three bighe of land, if he were to sell one bigha, how could he 
live off the land?’ 


In sentence (35), the PR is a peasant whose family has been living un- 
der the poverty threshold for a while, and who is considering the idea 
of selling a part of his land to make some money. However, his biggest 
concern derives from the emotional attachment he has to the land;
notably, the author even tells us that ‘to a peasant, land is dearer than
life’. The context makes it clear that the PE is here felt as strongly con- 
nected with the emotional sphere of the PR. Nonetheless, possession 
is here expressed through a locative construction and not through a 
genitive one. One might think that this example weakens the argument
according to which the choice of the genitive is based on the intimacy
of the possessive relationship, while the locative is used to encode
mere legal ownership. However, note that the PR here is thinking
of selling the land, so if on the one hand the PE is felt as intimately connected
with the PR, on the other hand it is also conceptualised as an
object of mere legal ownership, and that could explain the choice of the
locative marking on the PR.

However, these semantic explanations are not always adequate to 
explain the preference for one construction over the other: sometimes the
choice seems to be random. Consider the example in sentence (36), 
in which the speaker encodes three consecutive possessive construc- 
tions. The PR is always the same and it is encoded with the first-person 
plural pronoun, while the PEs are three different material entities:
land in the first construction, the crop of that land in the second construction,
and money in the third construction. The possession of land,
which should be felt as more intimate and less alienable than the possession
of the crops, is encoded through a construction that marks the PR
as a locative, while the possession of the land's crop, which should be
far more alienable and less connected with the emotional sphere of
the PR, is encoded through a genitive construction.¹¹


(36) utne hi khet to hamare pas bhi haim.
just as much.M.PL field.M.PL.NOM 1PL.LOC(beside) too to be.3PL.PRS
utni hi upaj hamari bhi hai. phir
just as much.F crop.F.SG.NOM 1PL.GEN.F too to be.3SG.PRS then
kyom hamare pas kaphan ko kauri nahim aur
why 1PL.LOC(beside) shroud PSP(for) cent.SG.NOM not and
unke ghar nai gay
3PL.GEN.M house[M.SG.OBL] new.F cow.F.SG.NOM
ati hai
to come.3SG.PRS.F
‘We own fields of the same size as his, and we have crops as good as his. Then how come
we don't even have a cent to buy a shroud, while they have a new cow in their
house?’


Lastly, consider the following example, in which once again the choice 
of the genitive construction seems not to be determined by an emo- 
tional attachment of the PR to the PE. See sentence (37): 


(37) “Mere sir mem jor ka dard ho raha hai. adha sir esa phata parta hai, jaise gir
jayaga." Mehta ne akar kaha [...] “Tumhare sath koi dava bhi to nahim hai?"
“Kya maim kisi marij ko dekhne a rahi thi, jo dava lekar calti?"
“I have got a terrible headache. My head's bursting as if half of it were about to
drop off.” Mehta walked over to her and said, "[...] Don't you have any pills with
you?” “Was I supposed to be visiting a patient? Why should I have brought any
pills?"

mera ek davaom ka baks
1SG.GEN.M a medicine.F.PL.OBL GEN.M box.M.SG.NOM
hai vah Semri mem hai
to be.3SG.PRS it is in Semri
‘I do have a box of medicine, (it is in Semri)’.


Example (37) is interesting because the reason for the use of a genitival
construction here seems to be pragmatic. As the context makes
clear, the PR - a doctor who is suffering from a bad headache - is asked
whether she has a box of medicine with her or not. The speaker answers
that she owns a box of medicines, but unluckily she does not
have it with her (she left it in Semri). As we have seen in § 6.1, the locative
construction with the postposition -ke pas is characterised by a
certain degree of ambiguity, since it allows the expression of all types
of possession - temporary possession, physical possession, and ownership.
Here, the speaker chooses to use the genitive construction in
order to avoid any ambiguity: she wants to emphasise that the possession
she is expressing is not of the physical type. Thus, it must be
noted that while the grammatical tradition of Hindi has mainly highlighted
the semantic implications of the use of the genitive construction,
sometimes the choice of this type of structure is influenced by pragmatic
factors.¹²


11 Note that the sentences in example (36) are pragmatically marked; for this reason
the order of constituents is here PE-PR-V.


6.3 Predications of Belonging in Hindi 


As mentioned above, one fundamental distinction holds between ascriptions
of possession (or have-constructions, e.g. ‘I have a new sari’)
and predications of belonging (or belong-constructions, e.g. ‘The new
sari is mine’). The difference between these two types is pragmatic
and primarily depends on the information packaging of the sentence. 
Thus far, possessive constructions where the PR is the topical element 
and the PE the comment have been discussed, i.e. how Hindi encodes 
ascriptions of possession with the possessive relationship profiled from 
the point of view of the PR. We now turn our attention to the ways in 
which Hindi expresses a possessee-oriented relation. 

This pragmatic distinction has been claimed to be cross-linguistical- 
ly valid (Heine 1997): every language has constructions that encode 
the possessive relationship from the inverse perspective, i.e. from the point
of view of the PE. However, while some languages mark the difference 
between belong-constructions and have-constructions by lexical and 
syntactic means, other languages do not clearly encode this distinc- 
tion. In English, for example, the verb have is used for a possessor-ori- 
ented expression, while the verb belong and the construction X is Y's 
are used to encode a possessee-oriented expression. In contrast, Hindi 
does not distinguish these two types of sentences through such lexical 


12 One of the anonymous reviewers suggests that the genitive construction here is 
used to imply a particular connection with the personal sphere of the PR, and there- 
fore to underline the intimate relation between the PR - a doctor - and the PE - a box of 
medicines (davaom ka baks). I do not fully agree with this interpretation: even if the PR 
is a doctor and this could imply that she feels the PE as closer to her personal sphere, 
I believe that in this specific case the reason behind the use of a genitive construc- 
tion instead of a locative one is to avoid any ambiguity: Dr Malti is saying that she has 
her medical box but at the moment it is not with her. In a non-marked context, a doctor 
would probably use a locative construction to encode the possession of a box of medi- 
cine and s/he would not map this relation as inalienable. 
and syntactic strategies. In this language, belong-constructions have
the following structure: the PR is marked with the genitive case and
the PE is in the nominative; the verb, once again, is hona ‘to be’. This
construction can be associated with Heine's Equation Schema and it 
is summarised with the following formula: Y is X's (property) > Y be- 
longs to X (Heine 1997, 65). Some examples of predications of belong- 
ing in Hindi are given in sentences (38) and (39): 


(38) jis makan mem rahta hūm, vah
REL.ADJ.OBL house.OBL LOC(in) to live.1SG.PRS.M CRR.PRN.NOM
ab merā nahim hai
now 1SG.GEN.M not to be.3SG.PRS
‘The house I am living in now does not belong to me’.


(39) Jhuniya ab hamari ho gai 
Jhuniya.NOM now 1PL.GEN.F become.1SG.AOR.F 
‘Now, Jhuniya has become ours (daughter)’. 


Clearly, this construction is quite similar to the genitive construction 
discussed in § 6.2 and used to encode ascriptions of possession, but 
there are some fundamental differences. 

1. First, while in ascriptions of possession the verb hona has an
existential meaning, in predications of belonging it has a cop- 
ular function: it only connects the PE to the PR and defines the 
tense and mood of the relationship. Indeed, like many other In- 
do-European languages, Hindi uses one verb, hona ‘to be’, for 
two major functions, the copular and the existential-locative. 

2. Second, in predications of belonging, the PR is not endowed 
with any syntactic properties of subjecthood (e.g. initial posi- 
tion, control of reflexive pronouns and adjectives, control of 
coreferential deletion) as it is in ascriptions of possession.
Notably, while in ascriptions of possession the unmarked order
of the constituents is PR-PE-V, in predications of belonging
the unmarked order is PE-PR-V. What accounts for this is that
in ascriptions of possession, the raising of the syntactic status
of the oblique PR is a consequence of its topicality, but as previously
noted, belong-constructions are characterised by an
inverse informational structure, where the PE is the topical element.

3. Third, in belong-constructions the PE, being the topic, is always definite
and known, while ascriptions of possession can also encode relations
in which the PE is indefinite and unknown.
According to some scholars (Taylor 1989; Heine 1997), there is a
strong correlation between predications of belonging and the con- 
cept of ownership: that is, predications of belonging do not share the 
same wide semantic extension that ascriptions of possession do, and 
they disallow the expression of such notions as temporary possession 
and physical possession. From the analysis of the corpus considered 
here, this correlation does appear to hold in part: when the construc- 
tion involves concrete PEs, generally the notion encoded is that of own- 
ership. However, evidence from Hindi tells us that this construction 
can also express other notions. 

From the analysis of the semantic features of the PR and PE of the Hindi constructions
found in the corpus, it emerges that while the PR is always
[+HUMAN], the PE can be [±HUMAN], [±ANIMATE] or [±CONCRETE]:
so, this construction can encode other notions besides ownership.
In particular, of the 23 occurrences of this construction in the corpus, a 
third encode kinship or social relationships (see example (39)), only two 
instances encode abstract possession (example (40)), and all the other 
instances are expressions of ownership of concrete entities (like house, 
bank, assets, as in sentence (38) above) and of animals (sentence (41)). 


(40) kanūn aur nyāy uskā hai,
law.M.SG.NOM and justice.M.SG.NOM CRR.PRN.GEN.M to be.3SG.PRS
jiske pās paisa hai
REL.PRN.LOC(beside) money.SG.NOM to be.3SG.PRS
‘Law and justice belong to the one who has money’.


(41) gay meri hogi 
cow.F.SG.NOM 1SG.GEN.F to be.3SG.F.FUT
‘The cow will be mine’. 


6.4 Genitive Constructions and the Expression of Inalienability


Hindi grammars (Hook 1979; Kachru 2006; Montaut 2004a; Milanet- 
ti, Gupta 2008) systematically associate the genitive construction with 
the notion of inalienability: this pattern is used to express intimate and 
inherent relationships, like kinship and body-part relationships. The 
locative construction, on the other hand, bears no such meaning. Ex- 
amples for the encoding of kinship relationships and body-part rela- 
tionships follow in (42) and (43): 


(42) unki tin larkiyam thim 
3PL.GEN.F three daughters.F.PL.NOM to be.3PL.PST.F 
‘He had three daughters’. 
(43) us larki ki nili ankhem haim
that.OBL girl.F.SG.OBL GEN.F blue.F eyes.F.PL.NOM to be.3PL.PRS
‘That girl has blue eyes’.


The terms alienable and inalienable (and their synonymic alternatives 
inherent vs. established; separable vs. inseparable) appear very fre- 
quently in the literature on possession. The term ‘inalienability’ in- 
dicates that the relationship between the two relata is conceived as 
inherent or indissoluble, like, for example, the relationship between 
family members and the relation between a body part and its posses- 
sor. The contents of the class of inalienable entities vary from culture 
to culture. However, Stassen (2009, 17) points out that when a lan- 
guage has a unique encoding for inalienability, “this encoding will al- 
most always cover at least the relation between a ‘possessor’ and his 
or her body parts, and/or the relation between a ‘possessor’ and the 
members of his or her kinship circle”, thus suggesting that these re- 
lationships are generally considered prototypical examples of inalien- 
ability. The fact that these two types of relationship seem to form the 
core of inalienable possession can be explained by the fact that body 
parts and family members are relational entities in the real world. 
Further extensions of inalienable encoding may then vary from cul- 
ture to culture; in Hindi, for example, not only are blood-kinship rela- 
tions seen as inalienable, but also intimate social relationships such as 
those with a friend or a spouse are usually encoded with the genitive 
construction. Professional relationships, in contrast, are not viewed 
as inalienable and are generally encoded with locative constructions 
(see examples (44) and (45), § 6.5).

Of the entire set of 21 sentences with genitive constructions
found in the corpus, 15 were identified as expressions of inalienabili- 
ty and only six sentences conveyed the meaning of ownership. Moreo- 
ver, it must be noted that no other syntactic schema can express inal- 
ienable relationships in Hindi. Considering this, it appears clear that 
genitive constructions have the unique ability to codify this notion; in 
this regard, Hindi aligns with the typological data presented by Heine 
(1997, 67): “in a number of languages, the Genitive Schema provides 
the primary means of expressing inalienable possession”. 
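
The proportions behind these figures can be checked with a short calculation. The sketch below uses only the counts reported above (21 genitive sentences, 15 inalienable, 6 ownership); the variable names are illustrative, and the percentages are simple shares of this one corpus rather than a statistical claim about Hindi in general.

```python
from fractions import Fraction

# Counts reported for genitive constructions in the corpus.
total_sentences = 21
counts = {"inalienability": 15, "ownership": 6}

# Sanity check: the two categories exhaust the set.
assert sum(counts.values()) == total_sentences

# Exact shares as fractions, then rounded percentages for readability.
shares = {label: Fraction(n, total_sentences) for label, n in counts.items()}
percentages = {label: round(float(share) * 100, 1) for label, share in shares.items()}

for label in counts:
    print(f"{label}: {counts[label]}/{total_sentences} = {percentages[label]}%")
```

On these counts, roughly five in seven genitive constructions (about 71%) express inalienability, against two in seven (about 29%) for ownership.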

Recall from § 6.2 that the Hindi genitive postposition forms an adjectival
structure: the genitival postposition -kā (/-ke/-ki) is attached to the
PR in the oblique case and agrees in gender, number and case with the
PE, as exemplified in the examples (28a)-(28e). It is significant that
inalienable possession is encoded only by the genitive construction,
while alienable possession in Hindi is typically expressed through the locative
construction, which does not require agreement. Discussing the use
of adjectival constructions for the encoding of inalienability in San- 
skrit, Viti (2004) points out that the category of adjective is generally 
used to express inherent and permanent properties. Moreover, she remarks
that the adjective is a relational category both from a syntactic
point of view - an adjective cannot occur without a head-noun - and 
from a semantic point of view - the value of an adjective depends on 
the noun it modifies. The same can be said about inalienable relation- 
ships: inalienable relationships are seen as inherent and permanent, 
and the inalienable PE is prototypically a relational entity (a family 
member or a body-part). In short, the fact that the genitive construc- 
tion uses a postposition that agrees in gender and number with the 
PE seems to be emblematic of the type of relationship that exists be- 
tween the two relata. We can thus conclude that the Hindi genitival 
construction is iconic: the relational nature of inalienability is formal- 
ly encoded through the use of the relational category of adjectives. 

Given the fact that the genitive construction is specialised for the 
encoding of inalienability, and given the high iconicity of Hindi (§ 5),
one might wonder why this construction allows the expression of
non-inherent and non-inalienable relationships such as ownership.
However, recall from § 6.2 that, when expressing ownership, the genitive
construction is generally associated with a more intimate possessive 
relationship. Moreover, as Pompeo (2010, 42) points out, in some con- 
texts the notion of ownership is similar to that of inalienable posses- 
sion in many respects: the relationship between the relata is particu- 
larly strong and it exists even without spatial proximity. Additionally, 
both relationships require an exclusive association between the PR and 
the PE. Thus, in Hindi, the use of the genitive construction to express 
ownership mirrors the similarity between this possessive notion and 
inalienable relationships. Once again, the choice of the syntactic pat- 
tern is dependent on the semantic properties of the event. 


6.5 Other Uses of the Locative and Genitive Constructions 


The constructions analysed in the previous paragraphs can also serve 
other semantic purposes. As in many other languages, Hindi posses- 
sive constructions can be used to encode non-possessive meanings, 
owing to the mental processes of metaphoric or metonymic extension. 
In what follows, semantic extensions of the locative construction are
considered first, followed by those of the genitive construction.

The locative construction can be used to express professional
relationships, as in example (44), ‘We have servants’. Notice that if the
relationship is not of a professional type, but a more general social 
relationship (as in ‘I also have a friend’, in example (45)) the use of 
the locative construction is disallowed, and the genitive construction 
is employed instead. It is worth noting that once again, the
parameters that influence the choice between the genitive and the
locative constructions are the emotional attachment and the intimacy
of the relationship.


(44) hamāre pās naukar-cākar haiṃ
1PL.LOC(beside) servant.M.PL.NOM to be.3PL.PRS
‘We have servants’.


(45) merā bhī koī hitū hai
1SG.GEN.M too INDF.ADJ.DIR friend.M.SG.NOM to be.3SG.PRS
‘I also have a friend’.


Furthermore, the locative construction can sometimes be used to ex- 
press some metaphorical possessive notions. Specifically, it is allowed 
in expressing the possession of abstract entities such as answer in ex- 
ample (46) or time in example (47); notably, however, these uses are 
conventionalised and do not constitute systematic phenomena. They 
probably derive from metaphorical conceptualisations of abstract
entities as concrete and material ones. For example, according to
Lakoff and Johnson (1980), the metaphor TIME IS MONEY occurs quite
frequently across cultures and languages: it is present in English, in 
Italian and also in Hindi. 


(46) dhaniyā ke pās javāb taiyār thā
Dhaniya LOC(beside) answer.M.SG.NOM ready to be.3SG.PST.M
‘Dhaniya had a ready answer’.


(47) unke pās lagan thī aur samay thā
3PL.LOC(beside) passion.F.SG.NOM to be.3SG.PST.F and time.M.SG.NOM to be.3SG.PST.M
‘He had passion and time’.


The genitive construction is also systematically used to encode 
non-core possessive notions. In particular, it is frequently used to
encode relationships in which one of the two relata is an abstract
entity, as in example (48):


(48) gharvāloṃ ke sāth uskā bhī kuch kartavya hai
family.M.PL.OBL PSP(with) 3SG.GEN.M also some responsibility.M.SG.NOM to be.3SG.PRS
‘He also has some responsibilities towards his family’.


Moreover, genitive constructions are also frequently used to encode
events belonging to the domains of cognition and volition, as exem- 
plified by sentences (49) and (50). Just as in the case of the extension 
of the semantics of locative constructions, a metaphorical explanation 
can also apply in the case of genitive constructions. The experiential 
cognitive domain is, cross-linguistically, intimately connected with the 
domain of possession; in a large number of the world's languages, an
Experiencer can be encoded as a Possessor in a possessive construction.
When this happens, the following metaphor is set in motion:
EXPERIENCERS ARE POSSESSORS OF EXPERIENCES AND EXPERIENCES ARE THINGS
POSSESSED (Luraghi 2014). This metaphor occurs, for example, in
English, in Greek (Benvenuto 2014; Benvenuto, Pompeo 2017;
Luraghi 2020), in Italian and in Latin (Fedriani 2014) among many oth- 
er languages. Note that the extension of the functionality of the geni- 
tive construction to the expression of experience is far more systemat- 
ic than the extension of locative constructions for metaphorical uses. 


(49) merā is vyavasthā par viśvās nahīṃ hai
1SG.GEN.M this.OBL system.OBL on faith.M.SG.NOM not to be.3SG.PRS
‘I have no faith in this system’.


(50) unkī yah icchā hai ki
3PL.GEN.F this.DIR desire.F.SG.NOM to be.3SG.PRS that
‘They want that... (Lit. Theirs is the desire that... / They have the desire 
that...)’. 


7 Other Constructions and the Notion of Abstract Possession


In the above exposition of Hindi possessive constructions, the
analysis of the dative construction and the inessive-locative
construction has been set aside. As mentioned in § 6, these two
constructions are used to
encode only the notion of abstract possession, and they disallow the 
encoding of prototypical possession. In some Hindi grammars (Hook 
1979; Kachru 2006), these two sentence-types are classified as pos- 
sessive constructions; however, it may be argued that they should not 
be considered as truly possessive, since they are prototypically used 
to express non-possessive situations. A brief discussion of these two 
patterns follows.


7.1 The Inessive Construction


In inessive constructions, the argument in the initial position is marked 
in the oblique case followed by the postposition meṃ ‘in’, while the 
second argument appears in the nominative case. The predicate is 
once again the verb honā ‘to be’ with an existential function. This
construction is sometimes considered to be specialised for the encoding
of ‘possession of qualities’. See the example in sentence (51). 


(51) unkī patnī meṃ kyoṃ vahī ātmābhimān nahīṃ hai
3PL.GEN.F wife.F LOC(in) why that.same self-confidence.SG.NOM not to be.3SG.PRS


‘Why doesn’t his wife have that same self-confidence?’ [Lit. ‘Why in his wife 
there is not that same self-confidence?’]. 


The inessive construction can also be used to encode what Heine
(1997, 35) defines as “inanimate inalienable possession”; see the
example in (52):


(52) us ghar meṃ cār kamre haiṃ
that.OBL house.OBL LOC(in) four room.M.PL.NOM to be.3PL.PRS
‘That house has four rooms’ [Lit. ‘In that house, there are four rooms’].


It can be argued that the inessive construction should not be included
in the classification of Hindi possessive constructions: this pattern
is not used to express prototypical possessive notions as defined in
this paper (§ 4). The inessive construction in (52), for example,
expresses a part-whole relationship rather than a possessive one. In
many languages, like English, part-whole (or meronymic) relationships
can be encoded through the constructions conventionalised for the
expression of possession (as in the sentence ‘That house has four
rooms’); in these languages, these sentences express ‘inanimate
inalienable possession’. However, in such cases, we are not dealing
with possessive relationships but rather with relational situations of
a different type that are being conceptualised through possession. As
Stassen points out, inanimate possession is to be “consider[ed] a
metaphorical extension of possession, in the same way that the notion
of possession can be extended into the domain of aspect or modality”
(2009, 17).

Remarkably, the metaphorical extension of possessive constructions to
part-whole relationships also takes place in Hindi: the sentence in
(52) can also be encoded through a genitive construction, as
exemplified in (53).


(53) us ghar ke cār kamre haiṃ
that.OBL house.OBL GEN.M.PL four room.M.PL.NOM to be.3PL.PRS
‘That house has four rooms’ [Lit. ‘Of that house, there are four rooms’].


The genitive construction in (53) can be interpreted as an expression
of “inanimate inalienable possession”: it expresses a meronymic
relationship metaphorically conceptualised as possession. Since the
semantics of the genitive construction is more relational than
possessive (§ 6.4), it is not surprising that this pattern allows the
encoding of both possessive relationships and meronymic relationships.
In contrast, the same extension is disallowed by the locative-adessive
construction with the postposition -ke pās, which is possessive (and
not relational) in its prototypical use.


7.2 The Dative Construction 


In Hindi grammars, dative constructions are sometimes numbered among
possessive constructions and are said to be specialised for the
encoding of ‘abstract possession’. In this type of sentence, the most
salient argument is marked with the dative, a case that in Hindi is
prototypically associated with the Experiencer/Beneficiary, not with
the Possessor. The second argument is in the nominative and agrees
with the existential verb honā. Some examples are given below:


(54) unheṃ kuch bolne-kā adhikār hai
3PL.DAT something say.INF.OBL-GEN.M right.M.SG.NOM to be.3SG.PRS
‘They have the right to say something’.


(55) mujhe sir-dard hai
1SG.DAT headache.SG.NOM to be.3SG.PRS
‘I have a headache’.


Once again, it is apparent that these constructions cannot really be 
considered possessive: what emerges from examples (54)-(55) is not 
an expression of possessive events but rather of other types of situa- 
tions. Example (54) illustrates a beneficiary event: the first argument 
should not be seen as a Possessor of an abstract entity, but instead as 
the Beneficiary of a situation; while in (55), the dative construction 
expresses a body sensation: the argument in the dative is an Experi- 
encer, not a Possessor. 

From a typological perspective, the notion of ‘abstract possession’ 
is quite problematic in itself: it is very far from the possessive
prototype, as it lacks both control and spatial proximity, and whether
it can even be considered as a possessive notion is debatable (Barðdal,
Danesi 2018). In SAE languages, like Italian and English, it makes sense
to assume the existence of such a notion, since the encoding strategy 
for the core possessive meaning is also used to express relation with 
abstract relata (as in example (6) ‘That woman has courage’). In
languages like English, then, the relation with an abstract entity, like the
body sensation in the sentence ‘I have a splitting headache’, is meta- 
phorically interpreted as possession. Note that the notion of abstract 
possession is not completely irrelevant in Hindi: as noted in § 6.5, 
while dealing with other uses of locative and genitive constructions, 
these patterns can encode relationships in which the first participant 
is [+ HUMAN] and the second participant is [− CONCRETE]. In
particular, the genitive construction is systematically used for the expression
of the experiential domains of cognition and volition which in Hindi can 
be metaphorically conceptualised through possessive relationships. In 
these cases, it makes sense to talk about abstract possession, since 
the constructions under consideration are prototypically associated 
with the encoding of core possessive notions and are metaphorically 
extended to the expression of other situations. This explanation,
however, does not work for a possessive interpretation of the Hindi
dative construction or the Hindi inessive construction: as noted, the
dative postposition -ko is prototypically associated with the
Beneficiary or the Experiencer of an event, while the locative
postposition meṃ is prototypically associated with inessive-locative
meaning.


8 Conclusion 


This paper has analysed the expression of core possessive notions in 
Hindi, demonstrating that two syntactic patterns can encode the
notion of ownership, namely, the locative construction with the
postposition -ke pās, and the genitive construction. The locative
construction is clearly the more conventional. Locative marking on the PR is
used to express the whole domain of alienable possession: it can en- 
code the notion of ownership, and it is also the only type of sentence 
that allows the expression of temporary or physical possession. More- 
over, it is highly specific in its semantics: except for the expression 
of professional relationships, it is rarely used for the encoding of
other situations.


[Figure 1 diagram labels: ALIENABLE POSSESSION; OWNERSHIP]

Figure 1 Semantic reconstruction of Hindi locative (possessive) construction


The genitive construction, instead, has vaguer semantics: its basic 
meaning is relational, not possessive. This construction is the canoni- 
cal vehicle for expressing inalienable relationships, but it can also be 
used to encode ownership when the semantics of the event shows some 
specific relational properties, i.e. when there is a strong connection be- 
tween the PE and the personal sphere of the PR. This explains why only 
the prototypical notion of ownership allows the use of genitive marking 
on the PR: temporary and physical possession are normally not char- 
acterised by an intimate relationship between the two entities. Addi- 
tionally, given its semantic vagueness, the genitive construction allows 
more functional extension. It can also be used metaphorically to encode 
possession of abstract entities and possession of psychological states. 


[Figure 2 diagram labels: INALIENABLE RELATIONSHIP; ALIENABLE POSSESSION; BODY-PART RELATIONSHIP; OWNERSHIP; KINSHIP]

Figure 2 Semantic reconstruction of Hindi genitive possessive construction


The results of this investigation are depicted in the overall semantic
map of Hindi possessive constructions shown in [fig. 3].


[Figure 3 diagram labels: TEMPORARY POSSESSION; PHYSICAL POSSESSION; INALIENABLE RELATIONSHIP; KINSHIP; PERCEPTION; COGNITION; VOLITION; SENSATION; BODY-PART RELATIONSHIP; EMOTION; BENEFACTION/MALEFACTION; MERONYMY]

Figure 3 Semantic maps of possessive constructions in Hindi. In red, the genitive construction;
in green, the locative construction; in blue, the inessive construction; and in yellow, the dative construction


List of abbreviations 


1 First person
2 Second person
3 Third person
ACC Accusative
ADJ Adjective
AOR Aorist
COM Comitative
CRR Correlative
CVB Converb
DAT Dative
DIR Direct
EMPH Emphatic
ERG Ergative
F Feminine
FUT Future
GEN Genitive
INDF Indefinite
INF Infinitive
INS Instrumental
LOC Locative
M Masculine
NOM Nominative
OBL Oblique
PE Possessee
PL Plural
PR Possessor
PRF Perfect
PRN Pronoun
PRS Present
PSP Postposition
PST Past
REFL Reflexive
REL Relative
SG Singular
SBJV Subjunctive
V Verb